Preface


Quantum theory is one of the most difficult subjects in the physics curriculum. 
In part this is because of unfamiliar mathematics: partial differential equations, 
Fourier transforms, complex vector spaces with inner products. But there is also 
the problem of relating mathematical objects, such as wave functions, to the phys- 
ical reality they are supposed to represent. In some sense this second problem is 
more serious than the first, for even the founding fathers of quantum theory had a 
great deal of difficulty understanding the subject in physical terms. The usual ap- 
proach found in textbooks is to relate mathematics and physics through the concept 
of a measurement and an associated wave function collapse. However, this does 
not seem very satisfactory as the foundation for a fundamental physical theory. 
Most professional physicists are somewhat uncomfortable with using the concept 
of measurement in this way, while those who have looked into the matter in greater 
detail, as part of their research into the foundations of quantum mechanics, are 
well aware that employing measurement as one of the building blocks of the sub- 
ject raises at least as many, and perhaps more, conceptual difficulties than it solves. 

It is in fact not necessary to interpret quantum mechanics in terms of measure- 
ments. The primary mathematical constructs of the theory, that is to say wave 
functions (or, to be more precise, subspaces of the Hilbert space), can be given 
a direct physical interpretation whether or not any process of measurement is in- 
volved. Doing this in a consistent way yields not only all the insights provided 
in the traditional approach through the concept of measurement, but much more 
besides, for it makes it possible to think in a sensible way about quantum systems 
which are not being measured, such as unstable particles decaying in the center 
of the earth, or in intergalactic space. Achieving a consistent interpretation is not 
easy, because one is constantly tempted to import the concepts of classical physics, 
which fit very well with the mathematics of classical mechanics, into the quantum 
domain where they sometimes work, but are often in conflict with the very different 
mathematical structure of Hilbert space that underlies quantum theory. The result 
of using classical concepts where they do not belong is to generate contradictions
and paradoxes of the sort which, especially in more popular expositions of the sub- 
ject, make quantum physics seem magical. Magic may be good for entertainment, 
but the resulting confusion is not very helpful to students trying to understand the 
subject for the first time, or to more mature scientists who want to apply quantum 
principles to a new domain where there is not yet a well-established set of princi- 
ples for carrying out and interpreting calculations, or to philosophers interested in 
the implications of quantum theory for broader questions about human knowledge 
and the nature of the world. 

The basic problem which must be solved in constructing a rational approach 
to quantum theory that is not based upon measurement as a fundamental princi- 
ple is to introduce probabilities and stochastic processes as part of the founda- 
tions of the subject, and not just an ad hoc and somewhat embarrassing addition to 
Schrödinger’s equation. Tools for doing this in a consistent way compatible with
the mathematics of Hilbert space first appeared in the scientific research literature 
about fifteen years ago. Since then they have undergone further developments and 
refinements although, as with almost all significant scientific advances, there have 
been some serious mistakes on the part of those involved in the new developments, 
as well as some serious misunderstandings on the part of their critics. However, the 
resulting formulation of quantum principles, generally known as consistent
histories (or as decoherent histories), appears to be fundamentally sound. It is
conceptually and mathematically “clean”: there is a small set of basic principles, not a
host of ad hoc rules needed to deal with particular cases. And it provides a rational 
resolution to a number of paradoxes and dilemmas which have troubled some of 
the foremost quantum physicists of the twentieth century. 

The purpose of this book is to present the basic principles of quantum theory 
with the probabilistic structure properly integrated with Schrödinger dynamics in
a coherent way which will be accessible to serious students of the subject (and 
their teachers). The emphasis is on physical interpretation, and for this reason 
I have tried to keep the mathematics as simple as possible, emphasizing finite- 
dimensional vector spaces and making considerable use of what I call “toy models.” 
They are a sort of quantum counterpart to the massless and frictionless pulleys 
of introductory classical mechanics; they make it possible to focus on essential 
issues of physics without being distracted by too many details. This approach 
may seem simplistic, but when properly used it can yield, at least for a certain 
class of problems, a lot more physical insight for a given expenditure of time than 
either numerical calculations or perturbation theory, and it is particularly useful for 
resolving a variety of confusing conceptual issues. 

An overview of the contents of the book will be found in the first chapter. In 
brief, there are two parts: the essentials of quantum theory, in Chs. 2-16, and 
a variety of applications, including measurements and paradoxes, in Chs. 17-27.
References to the literature have (by and large) been omitted from the main text, 
and will be found, along with a few suggestions for further reading, in the bibli- 
ography. In order to make the book self-contained I have included, without giving 
proofs, those essential concepts of linear algebra and probability theory which are 
needed in order to obtain a basic understanding of quantum mechanics. The level 
of mathematical difficulty is comparable to, or at least not greater than, what one 
finds in advanced undergraduate or beginning graduate courses in quantum theory. 

That the book is self-contained does not mean that reading it in isolation from 
other material constitutes a good way for someone with no prior knowledge to 
learn the subject. To begin with, there is no reference to the basic phenomenol- 
ogy of blackbody radiation, the photoelectric effect, atomic spectra, etc., which 
provided the original motivation for quantum theory and still form a very impor- 
tant part of the physical framework of the subject. Also, there is no discussion 
of a number of standard topics, such as the hydrogen atom, angular momentum, 
harmonic oscillator wave functions, and perturbation theory, which are part of the 
usual introductory course. For both of these I can with a clear conscience refer the 
reader to the many introductory textbooks which provide quite adequate treatments 
of these topics. Instead, I have concentrated on material which is not yet found in 
textbooks (hopefully that situation will change), but is very important if one wants 
to have a clear understanding of basic quantum principles. 


It is a pleasure to acknowledge help from a large number of sources. First, I 
am indebted to my fellow consistent historians, in particular Murray Gell-Mann, 
James Hartle, and Roland Omnès, from whom I have learned a great deal over the
years. My own understanding of the subject, and therefore this book, owes much to 
their insights. Next, I am indebted to a number of critics, including Angelo Bassi, 
Bernard d’Espagnat, Fay Dowker, GianCarlo Ghirardi, Basil Hiley, Adrian Kent, 
and the late Euan Squires, whose challenges, probing questions, and serious efforts 
to evaluate the claims of the consistent historians have forced me to rethink my own 
ideas and also the manner in which they have been expressed. Over a number of 
years I have taught some of the material in the following chapters in both advanced 
undergraduate and introductory graduate courses, and the questions and reactions 
by the students and others present at my lectures have done much to clarify my 
thinking and (I hope) improve the quality of the presentation. 

I am grateful to a number of colleagues who read and commented on parts of the 
manuscript. David Mermin, Roland Omnès, and Abner Shimony looked at
particular chapters, while Todd Brun, Oliver Cohen, and David Collins read drafts of the
entire manuscript. As well as uncovering many mistakes, they made a large number 
of suggestions for improving the text, some though not all of which I adopted. For
this reason (and in any case) whatever errors of commission or omission are present 
in the final version are entirely my responsibility. 

I am grateful for the financial support of my research provided by the National 
Science Foundation through its Physics Division, and for a sabbatical year from 
my duties at Carnegie-Mellon University that allowed me to complete a large part
of the manuscript. Finally, I want to acknowledge the encouragement and help I 
received from Simon Capelin and the staff of Cambridge University Press. 

Pittsburgh, Pennsylvania Robert B Griffiths 

March 2001 
Introduction


1.1 Scope of this book 

Quantum mechanics is a difficult subject, and this book is intended to help the 
reader overcome the main difficulties in the way to understanding it. The first part 
of the book, Chs. 2-16, contains a systematic presentation of the basic principles of 
quantum theory, along with a number of examples which illustrate how these prin- 
ciples apply to particular quantum systems. The applications are, for the most part, 
limited to toy models whose simple structure allows one to see what is going on 
without using complicated mathematics or lengthy formulas. The principles them- 
selves, however, are formulated in such a way that they can be applied to (almost) 
any nonrelativistic quantum system. In the second part of the book, Chs. 17-25, 
these principles are applied to quantum measurements and various quantum para- 
doxes, subjects which give rise to serious conceptual problems when they are not 
treated in a fully consistent manner. 

The final chapters are of a somewhat different character. Chapter 26 on deco- 
herence and the classical limit of quantum theory is a very sketchy introduction 
to these important topics along with some indication as to how the basic princi- 
ples presented in the first part of the book can be used for understanding them. 
Chapter 27 on quantum theory and reality belongs to the interface between physics 
and philosophy and indicates why quantum theory is compatible with a real world 
whose existence is not dependent on what scientists think and believe, or the ex- 
periments they choose to carry out. The Bibliography contains references for those 
interested in further reading or in tracing the origin of some of the ideas presented 
in earlier chapters. 

The remaining sections of this chapter provide a brief overview of the material 
in Chs. 2-25. While it may not be completely intelligible in advance of reading 
the actual material, the overview should nonetheless be of some assistance to read- 
ers who, like me, want to see something of the big picture before plunging into 
the details. Section 1.2 concerns quantum systems at a single time, and Sec. 1.3
their time development. Sections 1.4 and 1.5 indicate what topics in mathematics
are essential for understanding quantum theory, and where the relevant material is 
located in this book, in case the reader is not already familiar with it. Quantum 
reasoning as it is developed in the first sixteen chapters is surveyed in Sec. 1.6. 
Section 1.7 concerns quantum measurements, treated in Chs. 17 and 18. Finally, 
Sec. 1.8 indicates the motivation behind the chapters, 19-25, devoted to quantum 
paradoxes. 


1.2 Quantum states and variables 

Both classical and quantum mechanics describe how physical objects move as a 
function of time. However, they do this using rather different mathematical struc- 
tures. In classical mechanics the state of a system at a given time is represented by a 
point in a phase space. For example, for a single particle moving in one dimension 
the phase space is the x, p plane consisting of pairs of numbers (x, p) representing 
the position and momentum. In quantum mechanics, on the other hand, the state of 
such a particle is given by a complex-valued wave function $\psi(x)$, and, as noted in
Ch. 2, the collection of all possible wave functions is a complex linear vector space 
with an inner product, known as a Hilbert space. 

The physical significance of wave functions is discussed in Ch. 2. Of particular 
importance is the fact that two wave functions $\phi(x)$ and $\psi(x)$ represent distinct
physical states in a sense corresponding to distinct points in the classical phase
space if and only if they are orthogonal in the sense that their inner product is
zero. Otherwise $\phi(x)$ and $\psi(x)$ represent incompatible states of the quantum
system (unless they are multiples of each other, in which case they represent the same
state). Incompatible states cannot be compared with one another, and this relation- 
ship has no direct analog in classical physics. Understanding what incompatibility 
does and does not mean is essential if one is to have a clear grasp of the principles 
of quantum theory. 
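
The three-way classification described above (distinct, same, or incompatible) can be sketched for the two-dimensional Hilbert space of a spin-half particle. This is only an illustration under stated assumptions: the function names `inner` and `relation`, the numerical tolerance, and the example vectors are introduced here and are not taken from the text.

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>: antilinear in phi, linear in psi."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def relation(phi, psi, tol=1e-12):
    """Classify two nonzero state vectors (hypothetical helper)."""
    overlap = abs(inner(phi, psi))
    norms = math.sqrt(inner(phi, phi).real * inner(psi, psi).real)
    if overlap < tol:
        return "distinct"        # orthogonal: inner product is zero
    if abs(overlap - norms) < tol:
        return "same"            # multiples of each other (Cauchy-Schwarz equality)
    return "incompatible"        # neither orthogonal nor proportional

# Example vectors written in the S_z basis (illustrative choices).
z_up = [1 + 0j, 0j]
z_down = [0j, 1 + 0j]
x_up = [1 / math.sqrt(2) + 0j, 1 / math.sqrt(2) + 0j]

print(relation(z_up, z_down))     # distinct
print(relation(z_up, [2j, 0j]))   # same
print(relation(z_up, x_up))       # incompatible
```

The test for "same" uses the fact that the Cauchy-Schwarz inequality becomes an equality exactly when one vector is a multiple of the other.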

A quantum property, Ch. 4, is the analog of a collection of points in a clas- 
sical phase space, and corresponds to a subspace of the quantum Hilbert space, 
or the projector onto this subspace. An example of a (classical or quantum) 
property is the statement that the energy E of a physical system lies within some 
specific range, $E_0 < E < E_1$. Classical properties can be subjected to various
logical operations: negation, conjunction (AND), and disjunction (OR). The same 
is true of quantum properties as long as the projectors for the corresponding sub- 
spaces commute with each other. If they do not, the properties are incompatible 
in much the same way as nonorthogonal wave functions, a situation discussed in 
Sec. 4.6. 
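
As a rough sketch of the commutation condition just described, the following represents properties of a spin-half particle by 2 x 2 projector matrices and forms negation, conjunction, and disjunction only when the projectors commute. The helper names and the convention of returning `None` for incompatible properties are assumptions made for this illustration, and the formulas I - P, PQ, and P + Q - PQ are the standard ones for commuting projectors.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commute(p, q, tol=1e-12):
    """True if PQ = QP up to rounding."""
    pq, qp = matmul(p, q), matmul(q, p)
    n = len(p)
    return all(abs(pq[i][j] - qp[i][j]) < tol
               for i in range(n) for j in range(n))

def negation(p):
    """NOT P is represented by I - P."""
    n = len(p)
    return [[(1 if i == j else 0) - p[i][j] for j in range(n)]
            for i in range(n)]

def conjunction(p, q):
    """P AND Q is represented by PQ; None if P and Q do not commute."""
    return matmul(p, q) if commute(p, q) else None

def disjunction(p, q):
    """P OR Q is represented by P + Q - PQ; None if P and Q do not commute."""
    if not commute(p, q):
        return None
    pq = matmul(p, q)
    n = len(p)
    return [[p[i][j] + q[i][j] - pq[i][j] for j in range(n)]
            for i in range(n)]

# Spin-half projectors written in the S_z basis (illustrative choices).
P_Z_UP = [[1, 0], [0, 0]]            # S_z = +1/2
P_Z_DOWN = [[0, 0], [0, 1]]          # S_z = -1/2
P_X_UP = [[0.5, 0.5], [0.5, 0.5]]    # S_x = +1/2

print(negation(P_Z_UP) == P_Z_DOWN)      # True
print(disjunction(P_Z_UP, P_Z_DOWN))     # [[1, 0], [0, 1]]
print(conjunction(P_Z_UP, P_X_UP))       # None
```

The last line reflects the incompatibility of the S_z and S_x properties: their projectors do not commute, so no conjunction is formed.
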

An orthonormal basis of a Hilbert space or, more generally, a decomposition of
the identity as a sum of mutually commuting projectors constitutes a sample space 
of mutually-exclusive possibilities, one and only one of which can be a correct de- 
scription of a quantum system at a given time. This is the quantum counterpart 
of a sample space in ordinary probability theory, as noted in Ch. 5, which dis- 
cusses how probabilities can be assigned to quantum systems. An important differ- 
ence between classical and quantum physics is that quantum sample spaces can be 
mutually incompatible, and probability distributions associated with incompatible 
spaces cannot be combined or compared in any meaningful way. 

In classical mechanics a physical variable, such as energy or momentum, corre- 
sponds to a real-valued function defined on the phase space, whereas in quantum 
mechanics, as explained in Sec. 5.5, it is represented by a Hermitian operator. Such 
an operator can be thought of as a real-valued function defined on a particular sam- 
ple space, or decomposition of the identity, but not on the entire Hilbert space. 
In particular, a quantum system can be said to have a value (or at least a precise 
value) of a physical variable represented by the operator F if and only if the quan- 
tum wave function is in an eigenstate of F, and in this case the eigenvalue is the 
value of the physical variable. Two physical variables whose operators do not com- 
mute correspond to incompatible sample spaces, and in general it is not possible to 
simultaneously assign values of both variables to a single quantum system. 
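
One standard way to write the statement above is the spectral decomposition of a Hermitian operator. The symbols $f_j$ (distinct real eigenvalues), $P_j$ (projectors forming a decomposition of the identity), and $|z^\pm\rangle$ (eigenstates of the z component of spin) are notation chosen here for illustration and may differ from that used in Sec. 5.5.

```latex
F = \sum_j f_j P_j, \qquad \sum_j P_j = I, \qquad P_j P_k = \delta_{jk} P_j .
% Example: z component of spin for a spin-half particle, in units of \hbar
S_z = \tfrac{1}{2}\,|z^+\rangle\langle z^+| \;-\; \tfrac{1}{2}\,|z^-\rangle\langle z^-| .
```

In this form $F$ assigns the real number $f_j$ to the element $P_j$ of the sample space, which is the sense in which it acts as a real-valued function on that particular decomposition of the identity.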


1.3 Quantum dynamics 

Both classical and quantum mechanics have dynamical laws which enable one to 
say something about the future (or past) state of a physical system if its state is 
known at a particular time. In classical mechanics the dynamical laws are deter- 
ministic: at any given time in the future there is a unique state which corresponds to 
a given initial state. As discussed in Ch. 7, the quantum analog of the deterministic 
dynamical law of classical mechanics is the (time-dependent) Schrödinger equation.
Given some wave function $\psi_0$ at a time $t_0$, integration of this equation leads
to a unique wave function $\psi_t$ at any other time t. At two times t and t' these
uniquely defined wave functions are related by a unitary map or time development
operator T(t', t) on the Hilbert space. Consequently we say that integrating the
Schrödinger equation leads to unitary time development.
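
The properties of the time development operator can be summarized in a few lines. The first line holds in general; the last assumes a time-independent Hamiltonian $H$, a symbol not introduced in the text above.

```latex
\psi_{t'} = T(t', t)\,\psi_t, \qquad T(t', t)^\dagger\, T(t', t) = I,
\qquad T(t'', t')\,T(t', t) = T(t'', t), \qquad T(t, t) = I .
% Special case, assuming a time-independent Hamiltonian H:
T(t', t) = \exp\!\left[-\,i\,(t' - t)\,H/\hbar\right] .
```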

However, quantum mechanics also allows for a stochastic or probabilistic time 
development, analogous to tossing a coin or rolling a die several times in a row. 
In order to describe this in a systematic way, one needs the concept of a quan- 
tum history, introduced in Ch. 8: a sequence of quantum events (wave functions 
or subspaces of the Hilbert space) at successive times. A collection of mutually 
exclusive histories forms a sample space or family of histories, where each history
is associated with a projector on a history Hilbert space. 

The successive events of a history are, in general, not related to one another 
through the Schrödinger equation. However, the Schrödinger equation, or, equivalently,
the time development operators T(t', t), can be used to assign probabilities
to the different histories belonging to a particular family. For histories involving 
only two times, an initial time and a single later time, probabilities can be assigned 
using the Born rule, as explained in Ch. 9. However, if three or more times are 
involved, the procedure is a bit more complicated, and probabilities can only be 
assigned in a consistent way when certain consistency conditions are satisfied, as 
explained in Ch. 10. When the consistency conditions hold, the corresponding 
sample space or event algebra is known as a consistent family of histories, or a 
framework. Checking consistency conditions is not a trivial task, but it is made 
easier by various rules and other considerations discussed in Ch. 11. Chapters 9, 
10, 12, and 13 contain a number of simple examples which illustrate how the proba- 
bility assignments in a consistent family lead to physically reasonable results when 
one pays attention to the requirement that stochastic time development must be 
described using a single consistent family or framework, and results from incom- 
patible families, as defined in Sec. 10.4, are not combined. 
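
For the two-time case mentioned above, the Born rule assigns to each element of an orthonormal basis at the later time the squared magnitude of its overlap with the unitarily evolved initial state. The sketch below is an illustration only: the function names, the particular 2 x 2 unitary matrix `T`, and the choice of basis are hypothetical and are not taken from Ch. 9. It does not address the consistency conditions needed for three or more times.

```python
import math

def apply(op, vec):
    """Apply a matrix (nested lists) to a vector."""
    return [sum(op[i][j] * vec[j] for j in range(len(vec)))
            for i in range(len(op))]

def inner(phi, psi):
    """Inner product <phi|psi>: antilinear in phi, linear in psi."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def born_probabilities(T, psi0, basis):
    """Pr(k) = |<phi_k| T |psi0>|^2, assuming psi0 is normalized
    and basis is an orthonormal basis at the later time."""
    evolved = apply(T, psi0)
    return [abs(inner(phi, evolved)) ** 2 for phi in basis]

s = 1 / math.sqrt(2)
T = [[s, s], [s, -s]]          # hypothetical unitary time development operator
psi0 = [1, 0]                  # normalized initial state
basis = [[1, 0], [0, 1]]       # orthonormal basis (sample space) at the later time

probs = born_probabilities(T, psi0, basis)
print(probs)                   # two values, each close to 0.5
print(sum(probs))              # close to 1, since T is unitary
```

Because the basis is a decomposition of the identity and `T` is unitary, the probabilities add up to one, as required for a sample space of mutually exclusive possibilities.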


1.4 Mathematics I. Linear algebra 

Several branches of mathematics are important for quantum theory, but of these 
the most essential is linear algebra. It is the fundamental mathematical language 
of quantum mechanics in much the same way that calculus is the fundamental 
mathematical language of classical mechanics. One cannot even define essential 
quantum concepts without referring to the quantum Hilbert space, a complex linear 
vector space equipped with an inner product. Hence a good grasp of what quantum 
mechanics is all about, not to mention applying it to various physical problems, 
requires some familiarity with the properties of Hilbert spaces. 

Unfortunately, the wave functions for even such a simple system as a quan- 
tum particle in one dimension form an infinite-dimensional Hilbert space, and the 
rules for dealing with such spaces with mathematical precision, found in books on 
functional analysis, are rather complicated and involve concepts, such as Lebesgue 
integrals, which fall outside the mathematical training of the majority of physicists. 
Fortunately, one does not have to learn functional analysis in order to understand 
the basic principles of quantum theory. The majority of the illustrations used in 
Chs. 2-16 are toy models with a finite-dimensional Hilbert space to which the 
usual rules of linear algebra apply without any qualification, and for these mod- 
els there are no mathematical subtleties to add to the conceptual difficulties of 
quantum theory. To be sure, mathematical simplicity is achieved at a certain cost,
as toy models are even less “realistic” than the already artificial one-dimensional 
models one finds in textbooks. Nevertheless, they provide many useful insights 
into general quantum principles. 

For the benefit of readers not already familiar with them, the concepts of linear 
algebra in finite-dimensional spaces which are most essential to quantum theory 
are summarized in Ch. 3, though some additional material is presented later: ten- 
sor products in Ch. 6 and unitary operators in Sec. 7.2. Dirac notation, in which 
elements of the Hilbert space are denoted by $|\psi\rangle$, and their duals by $\langle\psi|$, the
inner product $\langle\phi|\psi\rangle$ is linear in the element on the right and antilinear in the one
on the left, and matrix elements of an operator A take the form $\langle\phi|A|\psi\rangle$, is used
throughout the book. Dirac notation is widely used and universally understood 
among quantum physicists, so any serious student of the subject will find
learning it well worthwhile. Anyone already familiar with linear algebra will have no
trouble picking up the essentials of Dirac notation by glancing through Ch. 3. 

It would be much too restrictive and also rather artificial to exclude from this 
book all references to quantum systems with an infinite-dimensional Hilbert space. 
As far as possible, quantum principles are stated in a form in which they apply to 
infinite- as well as to finite-dimensional spaces, or at least can be applied to the 
former given reasonable qualifications which mathematically sophisticated readers 
can fill in for themselves. Readers not in this category should simply follow the 
example of the majority of quantum physicists: go ahead and use the rules you 
learned for finite-dimensional spaces, and if you get into difficulty with an infinite- 
dimensional problem, go talk to an expert, or consult one of the books indicated in 
the bibliography (under the heading of Ch. 3). 


1.5 Mathematics II. Calculus, probability theory 

It is obvious that calculus plays an essential role in quantum mechanics; e.g., the 
inner product on a Hilbert space of wave functions is defined in terms of an integral,
and the time-dependent Schrödinger equation is a partial differential equation.
Indeed, the problem of constructing explicit solutions as a function of time to the
Schrödinger equation is one of the things which makes quantum mechanics more
difficult than classical mechanics. For example, describing the motion of a classi- 
cal particle in one dimension in the absence of any forces is trivial, while the time 
development of a quantum wave packet is not at all simple. 

Since this book focuses on conceptual rather than mathematical difficulties of 
quantum theory, considerable use is made of toy models with a simple discretized 
time dependence, as indicated in Sec. 7.4, and employed later in Chs. 9, 12, and 
13. To obtain their unitary time development, one only needs to solve a simple 
difference equation, and this can be done in closed form on the back of an envelope.
Because there is no need for approximation methods or numerical solutions, these 
toy models can provide a lot of insight into the structure of quantum theory, and 
once one sees how to use them, they can be a valuable guide in discerning what are 
the really essential elements in the much more complicated mathematical structures 
needed in more realistic applications of quantum theory. 
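
To give a feeling for what such a difference equation looks like, here is a minimal sketch of a hypothetical hopping model: a particle on a ring of M sites whose amplitude moves one site per time step. This specific model, the ring geometry, and the function names are assumptions made for illustration, and are not necessarily the toy model defined in Sec. 7.4.

```python
def step(psi):
    """One time step of the shift: psi_{t+1}(m) = psi_t(m - 1), sites on a ring."""
    M = len(psi)
    return [psi[(m - 1) % M] for m in range(M)]

def closed_form(psi0, t):
    """Closed-form solution of the difference equation:
    psi_t(m) = psi_0((m - t) mod M)."""
    M = len(psi0)
    return [psi0[(m - t) % M] for m in range(M)]

def norm_sq(psi):
    """Squared norm: sum over sites of |psi(m)|^2."""
    return sum(abs(a) ** 2 for a in psi)

psi0 = [0.6, 0.8j, 0, 0, 0]    # normalized amplitudes on a ring of 5 sites
psi = psi0
for t in range(1, 8):
    psi = step(psi)
    assert psi == closed_form(psi0, t)                  # iteration matches closed form
    assert abs(norm_sq(psi) - norm_sq(psi0)) < 1e-12    # norm is preserved

print(psi)   # amplitudes shifted by 7 mod 5 = 2 sites
```

The shift merely permutes the amplitudes, so it preserves the norm; this is the discrete-time counterpart of unitary time development, obtained without any approximation or numerical integration.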

Probability theory plays an important role in discussions of the time develop- 
ment of quantum systems. However, the more sophisticated parts of this discipline, 
those that involve measure theory, are not essential for understanding basic quan- 
tum concepts, although they arise in various applications of quantum theory. In 
particular, when using toy models the simplest version of probability theory, based 
on a finite discrete sample space, is perfectly adequate. And once the basic strategy 
for using probabilities in quantum theory has been understood, there is no partic- 
ular difficulty — or at least no greater difficulty than one encounters in classical 
physics — in extending it to probabilities of continuous variables, as in the case of 
$|\psi(x)|^2$ for a wave function $\psi(x)$.

In order to make this book self-contained, the main concepts of probability the- 
ory needed for quantum mechanics are summarized in Ch. 5, where it is shown 
how to apply them to a quantum system at a single time. Assigning probabilities 
to quantum histories is the subject of Chs. 9 and 10. It is important to note that 
the basic concepts of probability theory are the same in quantum mechanics as in 
other branches of physics; one does not need a new “quantum probability”. What 
distinguishes quantum from classical physics is the issue of choosing a suitable 
sample space with its associated event algebra. There are always many different 
ways of choosing a quantum sample space, and different sample spaces will often 
be incompatible, meaning that results cannot be combined or compared. However, 
in any single quantum sample space the ordinary rules for probabilistic reasoning 
are valid. 

Probabilities in the quantum context are sometimes discussed in terms of a den- 
sity matrix, a type of operator defined in Sec. 3.9. Although density matrices are 
not really essential for understanding the basic principles of quantum theory, they 
occur rather often in applications, and Ch. 15 discusses their physical significance 
and some of the ways in which they are used. 


1.6 Quantum reasoning 

The Hilbert space used in quantum mechanics is in certain respects quite dif- 
ferent from a classical phase space, and this difference requires that one make 
some changes in classical habits of thought when reasoning about a quantum system.
What is at stake becomes particularly clear when one considers the two-dimensional
Hilbert space of a spin-half particle, Sec. 4.6, for which it is easy to
see that a straightforward use of ideas which work very well for a classical phase 
space will lead to contradictions. Thinking carefully about this example is well
worthwhile, for if one cannot understand the simplest of all quantum systems, one
is not likely to make much progress with more complicated situations. One ap- 
proach to the problem is to change the rules of ordinary (classical) logic, and this 
was the route taken by Birkhoff and von Neumann when they proposed a special 
quantum logic. However, their proposal has not been particularly fruitful for re- 
solving the conceptual difficulties of quantum theory. 

The alternative approach adopted in this book, starting in Sec. 4.6 and sum- 
marized in Ch. 16, leaves the ordinary rules of propositional logic unchanged, but 
imposes conditions on what constitutes a meaningful quantum description to which 
these rules can be applied. In particular, it is never meaningful to combine incom- 
patible elements — be they wave functions, sample spaces, or consistent families 
— into a single description. This prohibition is embodied in the single-framework 
rule stated in Sec. 16.1, but already employed in various examples in earlier chap- 
ters. 

Because so many mutually incompatible frameworks are available, the strategy 
used for describing the stochastic time development of a quantum system is quite 
different from that employed in classical mechanics. In the classical case, if one 
is given an initial state, it is only necessary to integrate the deterministic equations 
of motion in order to obtain a unique result at any later time. By contrast, an 
initial quantum state does not single out a particular framework, or sample space 
of stochastic histories, much less determine which history in the framework will 
actually occur. To understand how frameworks are chosen in the quantum case, 
and why, despite the multiplicity of possible frameworks, the theory still leads to 
consistent and coherent physical results, it is best to look at specific examples, of 
which a number will be found in Chs. 9, 10, 12, and 13. 

Another aspect of incompatibility comes to light when one considers a tensor 
product of Hilbert spaces representing the subsystems of a composite system, or 
events at different times in the history of a single system. This is the notion of a 
contextual or dependent property or event. Chapter 14 is devoted to a systematic 
discussion of this topic, which also comes up in several of the quantum paradoxes 
considered in Chs. 20-25. 

The basic principles of quantum reasoning are summarized in Ch. 16 and shown 
to be internally consistent. This chapter also contains a discussion of the intuitive 
significance of multiple incompatible frameworks, one of the most significant ways 
in which quantum theory differs from classical physics. If the principles stated in 
Ch. 16 seem rather abstract, readers should work through some of the examples 
found in earlier or later chapters or, better yet, work out some for themselves. 





1.7 Quantum measurements 

A quantum theory of measurements is a necessary part of any consistent way of 
understanding quantum theory for a fairly obvious reason. The phenomena which 
are specific to quantum theory, which lack any description in classical physics, 
have to do with the behavior of microscopic objects, the sorts of things which 
human beings cannot observe directly. Instead we must use carefully constructed 
instruments to amplify microscopic effects into macroscopic signals of the sort 
we can see with our eyes, or feed into our computers. Unless we understand how 
the apparatus works, we cannot interpret its macroscopic output in terms of the 
microscopic quantum phenomena we are interested in. 

The situation is in some ways analogous to the problem faced by astronomers 
who depend upon powerful telescopes in order to study distant galaxies. If they 
did not understand how a telescope functions, cosmology would be reduced to 
pure speculation. There is, however, an important difference between the “tele- 
scope problem” of the astronomer and the “measurement problem” of the quan- 
tum physicist. No fundamental concepts from astronomy are needed in order to 
understand the operation of a telescope: the principles of optics are, fortunately, 
independent of the properties of the object which emits the light. But a piece of 
laboratory apparatus capable of amplifying quantum effects, such as a spark cham- 
ber, is itself composed of an enormous number of atoms, and nowadays we believe 
(and there is certainly no evidence to the contrary) that the behavior of aggregates 
of atoms as well as individual atoms is governed by quantum laws. Thus quan- 
tum measurements can, at least in principle, be analyzed using quantum theory. If 
for some reason such an analysis were impossible, it would indicate that quantum 
theory was wrong, or at least seriously defective. 

Measurements as parts of gedanken experiments played a very important role 
in the early development of quantum theory. In particular, Bohr was able to meet 
many of Einstein’s objections to the new theory by pointing out that quantum prin- 
ciples had to be applied to the measuring apparatus itself, as well as to the particle 
or other microscopic system of interest. A little later the notion of measurement 
was incorporated as a fundamental principle in the standard interpretation of quan- 
tum mechanics, accepted by the majority of quantum physicists, where it served 
as a device for introducing stochastic time development into the theory. As von 
Neumann explained it, a system develops unitarily in time, in accordance with 
Schrödinger’s equation, until it interacts with some sort of measuring apparatus,
at which point its wave function undergoes a “collapse” or “reduction” correlated 
with the outcome of the measurement. 

However, employing measurements as a fundamental principle for interpreting 
quantum theory is not very satisfactory. Nowadays quantum mechanics is applied
to processes taking place at the centers of stars, to the decay of unstable particles
in intergalactic space, and in many other situations which can scarcely be thought 
of as involving measurements. In addition, laboratory measurements are often of 
a sort in which the measured particle is either destroyed or else its properties are 
significantly altered by the measuring process, and the von Neumann scheme does 
not provide a satisfactory connection between the measurement outcome (e.g., a 
pointer position) and the corresponding property of the particle before the mea- 
surement took place. Numerous attempts have been made to construct a fully con- 
sistent measurement-based interpretation of quantum mechanics, thus far without 
success. Instead, this approach leads to a number of conceptual difficulties which 
constitute what specialists refer to as the “measurement problem.” 

In this book all of the fundamental principles of quantum theory are developed, 
in Chs. 2-16, without making any reference to measurements, though measure- 
ments occur in some of the applications. Measurements are taken up in Chs. 17 
and 18, and analyzed using the general principles of quantum mechanics intro- 
duced earlier. This includes such topics as how to describe a macroscopic mea- 
suring apparatus in quantum terms, the role of thermodynamic irreversibility in the 
measurement process, and what happens when two measurements are carried out in 
succession. The result is a consistent theory of quantum measurements based upon 
fundamental quantum principles, one which is able to reproduce all the results of 
the von Neumann approach and to go beyond it; e.g., by showing how the outcome 
of a measurement is correlated with some property of the measured system before 
the measurement took place. 

Wave function collapse or reduction, discussed in Sec. 18.2, is not needed for a 
consistent quantum theory of measurement, as its role is taken over by a suitable 
use of conditional probabilities. To put the matter in a different way, wave function 
collapse is one method for computing conditional probabilities that can be obtained 
equally well using other methods. Various conceptual difficulties disappear when 
one realizes that collapse is something which takes place in the theoretical physi- 
cist’s notebook and not in the experimental physicist’s laboratory. In particular, 
there is no physical process taking place instantaneously over a long distance, in 
conflict with relativity theory. 


1.8 Quantum paradoxes 

A large number of quantum paradoxes have come to light since the modern form
of quantum mechanics was first developed in the 1920s. A paradox is something 
which is contradictory, or contrary to common sense, but which seems to follow 
from accepted principles by ordinary logical rules. That is, it is something which 
ought to be true, but seemingly is not true. A scientific paradox may indicate that 
there is something wrong with the underlying scientific theory, which is quantum
mechanics in the case of interest to us. But a paradox can also be a prediction
of the theory that, while rather surprising when one first hears it, is shown by 
further study or deeper analysis to reflect some genuine feature of the universe 
in which we live. For example, in relativity theory we learn that it is impossible 
for a signal to travel faster than the speed of light. This seems paradoxical in 
that one can imagine being on a rocket ship traveling at half the speed of light, 
and then shining a flashlight in the forwards direction. However, this (apparent) 
paradox can be satisfactorily explained by making consistent use of the principles 
of relativity theory, in particular those which govern transformations to moving 
coordinate systems. 

A consistent understanding of quantum mechanics should make it possible to 
resolve quantum paradoxes by locating the points where they involve hidden as- 
sumptions or flawed reasoning, or by showing how the paradox embodies some 
genuine feature of the quantum world which is surprising from the perspective of 
classical physics. The formulation of quantum theory found in the first sixteen 
chapters of this book is employed in Chs. 20-25 to resolve a number of quantum 
paradoxes, including delayed choice, Kochen-Specker, EPR, and Hardy’s paradox,
among others. (Schrödinger’s cat and the double-slit paradox, or at least their toy
counterparts, are taken up earlier in the book, in Secs. 9.6 and 13.1, respectively, 
as part of the discussion of basic quantum principles.) Chapter 19 provides a brief 
introduction to these paradoxes along with two conceptual tools, quantum coins 
and quantum counterfactuals, which are needed for analyzing them. 

In addition to demonstrating the overall consistency of quantum theory, there 
are at least three other reasons for devoting a substantial amount of space to these 
paradoxes. The first is that they provide useful and interesting examples of how 
to apply the basic principles of quantum mechanics. Second, various quantum 
paradoxes have been invoked in support of the claim that quantum theory is in- 
trinsically nonlocal in the sense that there are mysterious influences which can, to 
take an example, instantly communicate the choice to carry out one measurement 
rather than another at point A to a distant point B, in a manner which contradicts 
the basic requirements of relativity theory. A careful analysis of these paradoxes 
shows, however, that the apparent contradictions arise from a failure to properly 
apply some principle of quantum reasoning in a purely local setting. Nonlocal in- 
fluences are generated by logical mistakes, and when the latter are corrected, the 
ghosts of nonlocality vanish. Third, these paradoxes have sometimes been used to 
argue that the quantum world is not real, but is in some way created by human con- 
sciousness, or else that reality is a concept which only applies to the macroscopic 
domain immediately accessible to human experience. Resolving the paradoxes, in 
the sense of showing them to be in accord with consistent quantum principles, is 
thus a prelude to the discussion of quantum reality in Ch. 27.


2.1 Classical and quantum particles

In classical Hamiltonian mechanics the state of a particle at a given instant of
time is given by two vectors: $\mathbf{r} = (x, y, z)$ representing its position, and
$\mathbf{p} = (p_x, p_y, p_z)$ representing its momentum. One can think of these two
vectors together as determining a point in a six-dimensional phase space. As time
increases the point representing the state of the particle traces out an orbit in the
phase space. To simplify the discussion, consider a particle which moves in only one
dimension, with position x and momentum p. Its phase space is the two-dimensional
x, p plane. If, for example, one is considering a harmonic oscillator with angular
frequency $\omega$, the orbit of a particle of mass m will be an ellipse of the form

$$x = A \sin(\omega t + \phi), \quad p = mA\omega \cos(\omega t + \phi) \tag{2.1}$$

for some amplitude A and phase $\phi$, as shown in Fig. 2.1.
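
As a quick numerical check of (2.1), the following Python sketch samples the orbit and confirms that every sampled point satisfies $(x/A)^2 + (p/mA\omega)^2 = 1$, which is the equation of an ellipse in the x, p plane. The parameter values and the function names `orbit` and `ellipse_invariant` are illustrative choices, not something taken from the text.

```python
import numpy as np

def orbit(t, A=1.0, m=1.0, omega=2.0, phi=0.3):
    """Return (x, p) on the harmonic-oscillator orbit of Eq. (2.1)."""
    x = A * np.sin(omega * t + phi)
    p = m * A * omega * np.cos(omega * t + phi)
    return x, p

def ellipse_invariant(x, p, A=1.0, m=1.0, omega=2.0):
    """(x/A)^2 + (p/(m*A*omega))^2, which should equal 1 on the orbit."""
    return (x / A) ** 2 + (p / (m * A * omega)) ** 2

# Sample the orbit at many times (arbitrary illustrative values).
t = np.linspace(0.0, 10.0, 1001)
x, p = orbit(t)

# True if every sampled phase-space point lies on the ellipse.
print(np.allclose(ellipse_invariant(x, p), 1.0))
```

The semi-axes of the ellipse are A along x and $mA\omega$ along p, so changing the phase $\phi$ only changes where on the ellipse the particle starts.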

A quantum particle at a single instant of time is described by a wave function
$\psi(\mathbf{r})$, a complex function of position $\mathbf{r}$. Again in the interests of
simplicity we will consider a quantum particle moving in one dimension, so that its
wave function $\psi(x)$ depends on only a single variable, the position x. Some examples
of real-valued wave functions, which can be sketched as simple graphs, are shown in
Figs. 2.2-2.4. It is important to note that all of the information required to describe
a quantum state is contained in the function $\psi(x)$. Thus this one function is the
quantum analog of the pair of real numbers x and p used to describe a classical
particle at a particular time.

Fig. 2.1. Phase space x, p for a particle in one dimension. The ellipse is a possible orbit
for a harmonic oscillator. The cross-hatched region corresponds to $x_1 \leq x \leq x_2$.

In order to understand the physical significance of quantum wave functions, one
needs to know that they belong to a linear vector space $\mathcal{H}$. That is, if $\psi(x)$ and
$\phi(x)$ are any two wave functions belonging to $\mathcal{H}$, the linear combination

$$\omega(x) = \alpha\psi(x) + \beta\phi(x), \tag{2.2}$$

where $\alpha$ and $\beta$ are any two complex numbers, also belongs to $\mathcal{H}$. The space
$\mathcal{H}$ is equipped with an inner product which assigns to any two wave functions
$\psi(x)$ and $\phi(x)$ the complex number

$$\langle\phi|\psi\rangle = \int_{-\infty}^{+\infty} \phi^*(x)\psi(x)\,dx. \tag{2.3}$$

Here $\phi^*(x)$ denotes the complex conjugate of the function $\phi(x)$. (The notation
used in (2.3) is standard among physicists, and differs in some trivial but annoying
details from that generally employed by mathematicians.)

The inner product $\langle\phi|\psi\rangle$ is analogous to the dot product

$$\mathbf{a} \cdot \mathbf{b} = a_x b_x + a_y b_y + a_z b_z \tag{2.4}$$

of two ordinary vectors $\mathbf{a}$ and $\mathbf{b}$. One difference is that a dot product is
always a real number, and $\mathbf{a} \cdot \mathbf{b}$ is the same as $\mathbf{b} \cdot \mathbf{a}$. By contrast,
the inner product defined in (2.3) is in general a complex number, and interchanging
$\psi(x)$ with $\phi(x)$ yields the complex conjugate:

$$\langle\psi|\phi\rangle = \langle\phi|\psi\rangle^*. \tag{2.5}$$

Despite this difference, the analogy between a dot product and an inner product is
useful in that it provides an intuitive geometrical picture of the latter.
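
The definition (2.3) and the symmetry property (2.5) can be tried out numerically. The sketch below approximates the integral by a simple Riemann sum on a finite grid; the grid, the helper name `inner`, and the two sample wave functions are assumptions made only for illustration.

```python
import numpy as np

# Finite grid standing in for the real line (the samples decay fast enough).
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]

def inner(phi, psi):
    """Approximate <phi|psi> = integral of conj(phi(x)) * psi(x) dx, Eq. (2.3)."""
    return np.sum(np.conj(phi) * psi) * dx

# Two hypothetical complex-valued wave functions.
psi = np.exp(-x**2) * np.exp(1j * x)
phi = (1.0 + 0.5j) * np.exp(-(x - 1.0)**2 / 2.0)

a = inner(phi, psi)   # <phi|psi>, a complex number in general
b = inner(psi, phi)   # <psi|phi>

print(a)
print(np.isclose(b, np.conj(a)))   # checks Eq. (2.5)
```

Note that `inner` conjugates its first argument, following the physicists' convention mentioned after (2.3).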

If $\langle\phi|\psi\rangle = 0$, which in view of (2.5) is equivalent to $\langle\psi|\phi\rangle = 0$, the
functions $\psi(x)$ and $\phi(x)$ are said to be orthogonal to each other. This is analogous to
$\mathbf{a} \cdot \mathbf{b} = 0$, which means that $\mathbf{a}$ and $\mathbf{b}$ are perpendicular to each other. The
concept of orthogonal (“perpendicular”) wave functions, along with certain generalizations
of this notion, plays an extremely important role in the physical interpretation of
quantum states. The inner product of $\psi(x)$ with itself,

$$\|\psi\|^2 = \int_{-\infty}^{+\infty} \psi^*(x)\psi(x)\,dx, \tag{2.6}$$

is a positive number whose (positive) square root $\|\psi\|$ is called the norm of $\psi(x)$.
The integral must be less than infinity for a wave function to be a member of $\mathcal{H}$.
Thus $e^{-ax^2}$ for $a > 0$ is a member of $\mathcal{H}$, whereas $e^{-ax}$ is not.

A complex linear space $\mathcal{H}$ with an inner product is known as a Hilbert space
provided it satisfies some additional conditions which are discussed in texts on
functional analysis and mathematical physics, but lie outside the scope of this book
(see the remarks in Sec. 1.4). Because of the condition that the norm as defined
in (2.6) be finite, the linear space of wave functions is called the Hilbert space of
square-integrable functions, often denoted by $L^2$.
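
The finite-norm condition can be made concrete with a short worked calculation. It assumes that the two examples mentioned above are the Gaussian $e^{-ax^2}$ and the exponential $e^{-ax}$, both with real $a > 0$; the cutoff L is introduced here only to show how the second integral diverges.

```latex
\begin{align*}
\psi(x) = e^{-ax^2}: \quad
  \|\psi\|^2 &= \int_{-\infty}^{+\infty} e^{-2ax^2}\,dx
             = \sqrt{\frac{\pi}{2a}} < \infty, \\
\psi(x) = e^{-ax}: \quad
  \int_{-L}^{+L} e^{-2ax}\,dx &= \frac{\sinh(2aL)}{a}
             \to \infty \quad \text{as } L \to \infty.
\end{align*}
```

The second function decays for large positive x but grows without bound for large negative x, which is why its norm is not finite.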


2.2 Physical interpretation of the wave function

The intuitive significance of the pair of numbers x , p used to describe a classical 
particle in one dimension at a particular time is relatively clear: the particle is 
located at the point x, and its velocity is p/m. The interpretation of a quantum 
wave function $\psi(x)$, on the other hand, is much more complicated, and an intuition
for what it means has to be built up by thinking about various examples. We will 
begin this process in Sec. 2.3. However, it is convenient at this point to make 
some very general observations, comparing and contrasting quantum with classical 
descriptions. 

Any point x, p in the classical phase space represents a possible state of the 
classical particle. In a similar way, almost every wave function in the space $\mathcal{H}$
represents a possible state of a quantum particle. The exception is the state $\psi(x)$
which is equal to 0 for every value of x, and thus has norm $\|\psi\| = 0$. This is
an element of the linear space, and from a mathematical point of view it is a very
significant element. Nevertheless, it cannot represent a possible state of a physical
system. All the other members of $\mathcal{H}$ represent possible quantum states.

A point in the phase space represents the most precise description one can have 
of the state of a classical particle. If one knows both x and p for a particle in one 
dimension, that is all there is to know. In the same way, the quantum wave function
$\psi(x)$ represents a complete description of a quantum particle; there is nothing
more that can be said about it. To be sure, a classical “particle” might possess some
sort of internal structure and in such a case the pair x, p, or $\mathbf{r}$, $\mathbf{p}$, would represent
the position of the center of mass and the total momentum, respectively, and one
would need additional variables in order to describe the internal degrees of freedom.
Similarly, a quantum particle can possess an internal structure, in which case
$\psi(x)$ or $\psi(\mathbf{r})$ provides a complete description of the center of mass, whereas $\psi$
must also depend upon additional variables if it is to describe the internal structure
as well as the center of mass. The quantum description of particles with internal
degrees of freedom, and of collections of several particles, is taken up in Ch. 6.

An important difference between the classical phase space and the quantum 
Hilbert space H has to do with the issue of whether elements which are mathe- 
matically distinct describe situations which are physically distinct. Let us begin 
with the classical case, which is relatively straightforward. Two states $(x, p)$ and
$(x', p')$ represent the same physical state if and only if

$$x' = x, \quad p' = p, \tag{2.7}$$

that is, if the two points in phase space coincide with each other. Otherwise they 
represent mutually-exclusive possibilities: a particle cannot be in two different 
places at the same time, nor can it have two different values of momentum (or 
velocity) at the same time. To summarize, two states of a classical particle have 
the same physical interpretation if and only if they have the same mathematical 
description. 


Fig. 2.2. Three wave functions which have the same physical meaning.

The case of a quantum particle is not nearly so simple. There are three different 
situations one needs to consider. 

1. If two functions $\psi(x)$ and $\phi(x)$ are multiples of each other, that is,
$\phi(x) = \alpha\psi(x)$ for some nonzero complex number $\alpha$, then these two functions have
precisely the same physical meaning. For example, all three functions in Fig. 2.2 have
the same physical meaning. This is in marked contrast to the waves one is familiar
with in classical physics, such as sound waves, or waves on the surface of water.
Increasing the amplitude of a sound wave by a factor of 2 means that it carries four
times as much energy, whereas multiplying a quantum wave function by 2 leaves
its physical significance unchanged.

Given any $\psi(x)$ with positive norm, it is always possible to introduce another
function

$$\bar\psi(x) = \psi(x)/\|\psi\| \tag{2.8}$$

which has the same physical meaning as $\psi(x)$, but whose norm is $\|\bar\psi\| = 1$. Such
normalized states are convenient when carrying out calculations, and for this reason
quantum physicists often develop a habit of writing wave functions in normalized
form, even when it is not really necessary. A normalized wave function remains
normalized when it is multiplied by a complex constant $e^{i\phi}$, where the phase $\phi$ is
some real number, and of course its physical meaning is not changed. Thus a
normalized wave function representing some physical situation still has an arbitrary
phase.

Warning! Although multiplying a wave function by a nonzero scalar does not 
change its physical significance, there are cases in which a careless use of this 
principle can lead to mistakes. Suppose that one is interested in a wave function 
which is a linear combination of two other functions, 

$$\psi(x) = \phi(x) + \omega(x). \tag{2.9}$$

Multiplying $\phi(x)$ but not $\omega(x)$ by a complex constant $\alpha$ leads to a function

$$\tilde\psi(x) = \alpha\phi(x) + \omega(x) \tag{2.10}$$

which does not, at least in general, have the same physical meaning as $\psi(x)$, because
it is not equal to a constant times $\psi(x)$.

2. Two wave functions $\phi(x)$ and $\psi(x)$ which are orthogonal to each other,
$\langle\phi|\psi\rangle = 0$, represent mutually exclusive physical states: if one of them is true,
in the sense that it is a correct description of the quantum system, the other is false,
that is, an incorrect description of the quantum system. For example, the inner
product of the two wave functions $\phi(x)$ and $\psi(x)$ sketched in Fig. 2.3 is zero, because
at any x where one of them is nonzero, the other is zero, and thus the integrand
in (2.3) is zero. As discussed in Sec. 2.3, if a wave function vanishes outside some
finite interval, the quantum particle is located inside that interval. Since the two
intervals $[x_1, x_2]$ and $[x_3, x_4]$ in Fig. 2.3 do not overlap, they represent mutually
exclusive possibilities: if the particle is in one interval, it cannot be in the other.

In Fig. 2.4, $\psi(x)$ and $\phi(x)$ are the ground state and first excited state of a quantum
particle in a smooth, symmetrical potential well (such as a harmonic oscillator).
In this case the vanishing of $\langle\phi|\psi\rangle$ is not quite so obvious, but it follows from
the fact that $\psi(x)$ is an even and $\phi(x)$ an odd function of x. Thus their product
is an odd function of x, and the integral in (2.3) vanishes. From a physical point
of view these two states are mutually exclusive possibilities because if a quantum
particle has a definite energy, it cannot have some other energy.

Fig. 2.3. Two orthogonal wave functions.

Fig. 2.4. Two orthogonal wave functions.

3. If $\phi(x)$ and $\psi(x)$ are not multiples of each other, and $\langle\phi|\psi\rangle$ is not equal to
zero, the two wave functions represent incompatible states-of-affairs, a relationship
which will be discussed in Sec. 4.6. Figure 2.5 shows a pair of incompatible wave
functions. It is obvious that $\phi(x)$ cannot be a multiple of $\psi(x)$, because there are
values of x at which $\phi$ is positive and $\psi$ is zero. On the other hand, it is also obvious
that the inner product $\langle\phi|\psi\rangle$ is not zero, for the integrand in (2.3) is positive, and
nonzero over a finite interval.

Fig. 2.5. Two incompatible wave functions.

There is nothing in classical physics corresponding to descriptions which are
incompatible in the quantum sense of the term. This is one of the main reasons
why quantum theory is hard to understand: there is no good classical analogy for
the situation shown in Fig. 2.5. Instead, one has to build up one’s physical intuition
for this situation using examples that are quantum mechanical. It is important to
keep in mind that quantum states which are incompatible stand in a very different
relationship to each other than states which are mutually exclusive; one must not
confuse these two concepts!
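
The three situations distinguished above can be told apart numerically from inner products alone. The Python sketch below is one possible way of doing so, not a procedure given in the text: it relies on the Cauchy-Schwarz inequality, $|\langle\phi|\psi\rangle| \leq \|\phi\|\,\|\psi\|$, in which equality holds exactly when the two functions are multiples of each other. The helper names, the tolerance, and the sample wave functions (harmonic-oscillator-like shapes) are illustrative assumptions.

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 3201)
dx = x[1] - x[0]

def inner(phi, psi):
    """Riemann-sum approximation to <phi|psi> of Eq. (2.3)."""
    return np.sum(np.conj(phi) * psi) * dx

def relation(phi, psi, tol=1e-8):
    """Classify two nonzero wave functions using only inner products."""
    overlap = abs(inner(phi, psi))
    bound = np.sqrt(inner(phi, phi).real * inner(psi, psi).real)
    if overlap <= tol * bound:
        return "mutually exclusive"      # orthogonal, case 2
    if overlap >= (1.0 - tol) * bound:
        return "same physical meaning"   # multiples of each other, case 1
    return "incompatible"                # neither, case 3

ground = np.exp(-x**2 / 2.0)             # even function of x
excited = x * np.exp(-x**2 / 2.0)        # odd function of x
shifted = np.exp(-(x - 1.0)**2 / 2.0)    # displaced copy of ground

print(relation(ground, (2.0 - 3.0j) * ground))
print(relation(ground, excited))
print(relation(ground, shifted))

# The warning attached to (2.9) and (2.10): rescaling only one term of a
# sum changes the physical meaning of the sum.
alpha = -1.0
print(relation(ground + excited, alpha * ground + excited))
```

On a finite grid the comparisons can only be made up to a tolerance, so the classification is a numerical approximation to the exact definitions.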


2.3 Wave functions and position 

The quantum wave function $\psi(x)$ is a function of x, and in classical physics x is
simply the position of the particle. But what can one say about the position of a
quantum particle described by $\psi(x)$? In classical physics wave packets are used
to describe water waves, sound waves, radar pulses, and the like. In each of these
cases the wave packet does not have a precise position; indeed, one would not
recognize something as a wave if it were not spread out to some extent. Thus there
is no reason to suppose that a quantum particle possesses a precise position if it is
described by a wave function $\psi(x)$, since the wave packet itself, thought of as a
mathematical object, is obviously not located at a precise position x.

In addition to waves, there are many objects, such as clouds and cities, which do 
not have a precise location. These, however, are made up of other objects whose 
location is more definite: individual water droplets in a cloud, or individual build- 
ings in a city. However, in the case of a quantum wave packet, a more detailed 
description in terms of smaller (better localized) physical objects or properties is 
not possible. To be sure, there is a very localized mathematical description: at
each x the wave packet takes on some precise value $\psi(x)$. But there is no reason to
suppose that this represents a corresponding physical “something” located at this
precise point. Indeed, the discussion in Sec. 2.2 suggests quite the opposite. To
begin with, the value of $\psi(x_0)$ at a particular point $x_0$ cannot in any direct way
represent the value of some physical quantity, since one can always multiply the
function $\psi(x)$ by a complex constant to obtain another wave function with the same
physical significance, thus altering $\psi(x_0)$ in an arbitrary fashion (unless, of course,
$\psi(x_0) = 0$). Furthermore, in order to see that the mathematically distinct wave
functions in Fig. 2.2 represent the same physical state of affairs, and that the two
functions in Fig. 2.4 represent distinct physical states, one cannot simply carry out
a point-by-point comparison; instead it is necessary to consider each wave function
“as a whole”.

It is probably best to think of a quantum particle as delocalized, that is, as not 
having a position which is more precise than that of the wave function representing 
its quantum state. The term “delocalized” should be understood as meaning that no 
precise position can be defined, and not as suggesting that a quantum particle is in 
two different places at the same time. Indeed, as we shall show in Sec. 4.5, there is a
well-defined sense in which a quantum particle cannot be in two (or more) places 
at the same time. 

Things which do not have precise positions, such as books and tables, can 
nonetheless often be assigned approximate locations, and it is often useful to do 
so. The situation with quantum particles is similar. There are two different, though 
related, approaches to assigning an approximate position to a quantum particle in 
one dimension (with obvious generalizations to higher dimensions). The first is 
mathematically quite “clean”, but can only be applied for a rather limited set of 
wave functions. The second is mathematically “sloppy”, but is often of more use 
to the physicist. Both of them are worth discussing, since each adds to one’s phys- 
ical understanding of the meaning of a wave function. 

It is sometimes the case, as in the examples in Figs. 2.2, 2.3, and 2.5, that the
quantum wave function is nonzero only in some finite interval

$$x_1 \leq x \leq x_2. \tag{2.11}$$

In such a case it is safe to assert that the quantum particle is not located outside
this interval, or, equivalently, that it is inside this interval, provided the latter is not
interpreted to mean that there is some precise point inside the interval where the
particle is located. In the case of a classical particle, the statement that it is not
outside, and therefore inside the interval (2.11) corresponds to asserting that the
point x, p representing the state of the particle falls somewhere inside the region
of its phase space indicated by the cross-hatching in Fig. 2.1. To be sure, since
the actual position of a classical particle must correspond to a single number x, we
know that if it is inside the interval (2.11), then it is actually located at a definite
point in this interval, even though we may not know what this precise point is. By
contrast, in the case of any of the wave functions in Fig. 2.2 it is incorrect to say
that the particle has a location which is more precise than is given by the interval
(2.11), because the wave packet cannot be located more precisely than this, and the
particle cannot be located more precisely than its wave packet.

Fig. 2.6. Some of the many wave functions which vanish outside the interval $x_1 \leq x \leq x_2$.


There is a quantum analog of the cross-hatched region of the phase space in
Fig. 2.1: it is the collection of all wave functions in $\mathcal{H}$ with the property that they
vanish outside the interval $[x_1, x_2]$. There are, of course, a very large number of
wave functions of this type, a few of which are indicated in Fig. 2.6. Given a wave
function which vanishes outside (2.11), it still has this property if multiplied by an
arbitrary complex number. And the sum of two wave functions of this type will
also vanish outside the interval. Thus the collection of all functions which vanish
outside $[x_1, x_2]$ is itself a linear space. If in addition we impose the condition
that the allowable functions have a finite norm, the corresponding collection of
functions $\mathcal{X}$ is part of the collection $\mathcal{H}$ of all allowable wave functions, and because
$\mathcal{X}$ is a linear space, it is a subspace of the quantum Hilbert space $\mathcal{H}$. As we shall
see in Ch. 4, a physical property of a quantum system can always be associated
with a subspace of $\mathcal{H}$, in the same way that a physical property of a classical system
corresponds to a subset of points in its phase space. In the case at hand, the physical
property of being located inside the interval $[x_1, x_2]$ corresponds in the classical
case to the cross-hatched region in Fig. 2.1, and in the quantum case to the subspace
$\mathcal{X}$ which has just been defined.

The notion of approximate location discussed above has limited applicability,
because one is often interested in wave functions which are never equal to zero,
or at least do not vanish outside some finite interval. An example is the Gaussian
wave packet

$$\psi(x) = \exp[-(x - x_0)^2/4(\Delta x)^2], \tag{2.12}$$

centered at $x = x_0$, where $\Delta x$ is a constant, with the dimensions of a length, that
provides a measure of the width of the wave packet. The function $\psi(x)$ is never
equal to 0. However, when $|x - x_0|$ is large compared to $\Delta x$, $\psi(x)$ is very small,
and so it seems sensible, at least to a physicist, to suppose that for this quantum
state, the particle is located “near” $x_0$, say within an interval

$$x_0 - k\Delta x \leq x \leq x_0 + k\Delta x, \tag{2.13}$$

where k might be set equal to 1 when making a rough back-of-the-envelope
calculation, or perhaps 2 or 3 or more if one is trying to be more careful or conservative.

What the physicist is, in effect, doing in such circumstances is approximating the
Gaussian wave packet in (2.12) by a function which has been set equal to 0 for x
lying outside the interval (2.13). Once the “tails” of the Gaussian packet have been
eliminated in this manner, one can employ the ideas discussed above for functions
which vanish outside some finite interval. To be sure, “cutting off the tails” of the
original wave function involves an approximation, and as with all approximations,
this requires the application of some judgment as to whether or not one will be
making a serious mistake, and this will in turn depend upon the sort of questions
which are being addressed. Since approximations are employed in all branches of
theoretical physics (apart from those which are indistinguishable from pure
mathematics), it would be quibbling to deny this possibility to the quantum physicist.
Thus it makes physical sense to say that the wave packet (2.12) represents a quantum
particle with an approximate location given by (2.13), as long as k is not too
small. Of course, similar reasoning can be applied to other wave packets which
have long tails.
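
To get a feeling for how much is discarded when the tails are cut off, one can compute the fraction of $\|\psi\|^2$ that lies outside the interval (2.13). For the Gaussian (2.12), $|\psi(x)|^2$ is proportional to $\exp[-(x - x_0)^2/2(\Delta x)^2]$, so this fraction is $1 - \mathrm{erf}(k/\sqrt{2})$, independent of $x_0$ and $\Delta x$. The sketch below evaluates this expression and cross-checks it by a direct sum on a grid; the function names and grid parameters are illustrative assumptions.

```python
import math
import numpy as np

def tail_fraction(k):
    """Fraction of ||psi||^2 for (2.12) lying outside |x - x0| <= k * Delta x."""
    return 1.0 - math.erf(k / math.sqrt(2.0))

def tail_fraction_numeric(k, x0=0.0, width=1.0, n=200001):
    """Same quantity from a Riemann sum; width plays the role of Delta x."""
    x = np.linspace(x0 - 12.0 * width, x0 + 12.0 * width, n)
    density = np.exp(-(x - x0) ** 2 / (2.0 * width ** 2))  # |psi(x)|^2
    inside = np.abs(x - x0) <= k * width
    return 1.0 - density[inside].sum() / density.sum()

for k in (1, 2, 3):
    print(k, round(tail_fraction(k), 4), round(tail_fraction_numeric(k), 4))
```

With k = 1 roughly a third of the squared norm lies in the discarded tails, while with k = 3 the discarded part is below one percent, which is in line with the remark that k should not be too small.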

It is sometimes said that the meaning, or at least one of the meanings, of the
wave function $\psi(x)$ is that

$$\rho(x) = |\psi(x)|^2/\|\psi\|^2 \tag{2.14}$$

is a probability distribution density for the particle to be located at the position x, or
found to be at the position x by a suitable measurement. Wave functions can indeed
be used to calculate probability distributions, and in certain circumstances (2.14) is
a correct way to do such a calculation. However, in quantum theory it is necessary
to differentiate between $\psi(x)$ as representing a physical property of a quantum
system, and $\psi(x)$ as a pre-probability, a mathematical device for calculating
probabilities. It is necessary to look at examples to understand this distinction, and we
shall do so in Ch. 9, following a general discussion of probabilities in quantum
theory in Ch. 5.


2.4 Wave functions and momentum 

The state of a classical particle in one dimension is specified by giving both x and 
p, while in the quantum case the wave function $\psi(x)$ depends upon only one of
these two variables. From this one might conclude that quantum theory has nothing
to say about the momentum of a particle, but this is not correct. The information
about the momentum provided by quantum mechanics is contained in $\psi(x)$, but
one has to know how to extract it. A convenient way to do so is to define the
momentum wave function

$$\hat\psi(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{+\infty} e^{-ipx/\hbar}\,\psi(x)\,dx \tag{2.15}$$

as the Fourier transform of $\psi(x)$.

Note that $\hat\psi(p)$ is completely determined by the position wave function $\psi(x)$.
On the other hand, (2.15) can be inverted by writing

$$\psi(x) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{+\infty} e^{+ipx/\hbar}\,\hat\psi(p)\,dp, \tag{2.16}$$

so that, in turn, $\psi(x)$ is completely determined by $\hat\psi(p)$. Therefore $\psi(x)$ and $\hat\psi(p)$
contain precisely the same information about a quantum state; they simply express
this information in two different forms. Whatever may be the physical significance
of $\psi(x)$, that of $\hat\psi(p)$ is exactly the same. One can say that $\psi(x)$ is the position
representation and $\hat\psi(p)$ the momentum representation of the single quantum
state which describes the quantum particle at a particular instant of time. (As an
analogy, think of a novel published simultaneously in two different languages: the
two editions represent exactly the same story, assuming the translator has done a
good job.) The inner product (2.3) can be expressed equally well using either the
position or the momentum representation:

$$\langle\phi|\psi\rangle = \int_{-\infty}^{+\infty} \phi^*(x)\psi(x)\,dx = \int_{-\infty}^{+\infty} \hat\phi^*(p)\hat\psi(p)\,dp. \tag{2.17}$$


Information about the momentum of a quantum particle can be obtained from the
momentum wave function in the same way that information about its position can
be obtained from the position wave function, as discussed in Sec. 2.3. A quantum
particle, unlike a classical particle, does not possess a well-defined momentum.
However, if $\hat\psi(p)$ vanishes outside an interval

$$p_1 \leq p \leq p_2, \tag{2.18}$$

it possesses an approximate momentum in that the momentum does not lie outside
the interval (2.18); equivalently, the momentum lies inside this interval, though it
does not have some particular precise value inside this interval.

Even when $\hat\psi(p)$ does not vanish outside any interval of the form (2.18), one can
still assign an approximate momentum to the quantum particle in the same way that
one can assign an approximate position when $\psi(x)$ has nonzero tails, as in (2.12).
In particular, in the case of a Gaussian wave packet

$$\hat\psi(p) = \exp[-(p - p_0)^2/4(\Delta p)^2] \tag{2.19}$$

it is reasonable to say that the momentum is “near” $p_0$ in the sense of lying in the
interval

$$p_0 - k\Delta p \leq p \leq p_0 + k\Delta p, \tag{2.20}$$

with k on the order of 1 or larger. The justification for this is that one is
approximating (2.19) with a function which has been set equal to 0 outside the interval
(2.20). Whether or not “cutting off the tails” in this manner is an acceptable
approximation is a matter of judgment, just as in the case of the position wave packet
discussed in Sec. 2.3.

The momentum wave function can be used to calculate a probability distribution 
density 

$$\rho(p) = |\hat\psi(p)|^2/\|\psi\|^2 \tag{2.21}$$

for the momentum $p$ in much the same way as the position wave function can be
used to calculate a similar density for $x$, (2.14). See the remarks following (2.14):
it is important to distinguish between $\hat\psi(p)$ as representing a physical property,
which is what we have been discussing, and as a pre-probability, which is its role
in (2.21). If one sets $x_0 = 0$ in the Gaussian wave packet (2.12) and carries out
the Fourier transform (2.15), the result is (2.19) with $p_0 = 0$ and $\Delta p = \hbar/2\Delta x$.
As shown in introductory textbooks, it is quite generally the case that for any given
quantum state

$$\Delta p \cdot \Delta x \ge \hbar/2, \tag{2.22}$$

where $(\Delta x)^2$ is the variance of the probability distribution density (2.14), and
$(\Delta p)^2$ the variance of the one in (2.21). Probabilities will be taken up later in
the book, but for present purposes it suffices to regard $\Delta x$ and $\Delta p$ as convenient,
albeit somewhat crude, measures of the widths of the wave packets $\psi(x)$ and $\hat\psi(p)$,
respectively. What the inequality tells us is that the narrower the position wave
packet $\psi(x)$, the broader the corresponding momentum wave packet $\hat\psi(p)$ has got
to be, and vice versa.
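
The relation between the two widths can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes units with $\hbar = 1$, a Gaussian position wave packet of width $\Delta x = 1$ centered at the origin, and a plain Riemann sum in place of the Fourier integral. The grid ranges and the helper name `width` are illustrative choices.

```python
import cmath
import math

HBAR = 1.0      # assumption of this sketch: units in which hbar = 1
SIGMA = 1.0     # chosen position width Delta x

def width(grid, density):
    """Standard deviation of an unnormalized density sampled on a uniform grid."""
    total = sum(density)
    mean = sum(g * d for g, d in zip(grid, density)) / total
    var = sum((g - mean) ** 2 * d for g, d in zip(grid, density)) / total
    return math.sqrt(var)

# Gaussian position wave packet centered at x = 0, sampled on a grid.
xs = [-10.0 + 0.05 * i for i in range(401)]
psi_x = [math.exp(-x * x / (4 * SIGMA ** 2)) for x in xs]

# Momentum wave function by direct numerical Fourier integration.
# Constant prefactors do not affect the widths, so they are omitted.
ps = [-5.0 + 0.025 * i for i in range(401)]
psi_p = [sum(cmath.exp(-1j * p * x / HBAR) * f for x, f in zip(xs, psi_x))
         for p in ps]

dx = width(xs, [abs(f) ** 2 for f in psi_x])
dp = width(ps, [abs(f) ** 2 for f in psi_p])
print(dx, dp, dx * dp)  # the product should be close to HBAR / 2
```

For a Gaussian the product sits at the lower bound of (2.22); other wave packets give a larger product.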

The inequality (2.22) expresses the well-known Heisenberg uncertainty prin- 
ciple. This principle is often discussed in terms of measurements of a particle’s 
position or momentum, and the difficulty of simultaneously measuring both of 
these quantities. While such discussions are not without merit — and we shall 
have more to say about measurements later in this book — they tend to put the 
emphasis in the wrong place, suggesting that the inequality somehow arises out of 
peculiarities associated with measurements. But in fact (2.22) is a consequence of
the decision by quantum physicists to use a Hilbert space of wave packets in order
to describe quantum particles, and to make the momentum wave packet for a par- 
ticular quantum state equal to the Fourier transform of the position wave packet for 
the same state. In the Hilbert space there are, as a fact of mathematics, no states for 
which the widths of the position and momentum wave packets violate the inequal- 
ity (2.22). Hence if this Hilbert space is appropriate for describing the real world, 
no particles exist for which the position and momentum can even be approximately 
defined with a precision better than that allowed by (2.22). If measurements can 
accurately determine the properties of quantum particles — another topic to which 
we shall later return — then the results cannot, of course, be more precise than the 
quantities which are being measured. To use an analogy, the fact that the location 
of the city of Pittsburgh is uncertain by several kilometers has nothing to do with 
the lack of precision of surveying instruments. Instead a city, as an extended object, 
does not have a precise location. 


2.5 Toy model 

The Hilbert space $\mathcal{H}$ for a quantum particle in one dimension is extremely large;
viewed as a linear space it is infinite-dimensional. Infinite-dimensional spaces pro- 
vide headaches for physicists and employment for mathematicians. Most of the 
conceptual issues in quantum theory have nothing to do with the fact that the 
Hilbert space is infinite-dimensional, and therefore it is useful, in order to sim- 
plify the mathematics, to replace the continuous variable x with a discrete variable 
m which takes on only a finite number of integer values. That is to say, we will 
assume that the quantum particle is located at one of a finite collection of sites ar- 
ranged in a straight line, or, if one prefers, it is located in one of a finite number of 
boxes or cells. It is often convenient to think of this system of sites as having “pe- 
riodic boundary conditions” or as placed on a circle, so that the last site is adjacent 
to (just in front of) the first site. If one were representing a wave function numer- 
ically on a computer, it would be sensible to employ a discretization of this type. 
However, our goal is not numerical computation, but physical insight. Temporarily 
shunting mathematical difficulties out of the way is part of a useful “divide and 
conquer” strategy for attacking difficult problems. Our aim will not be realistic de- 
scriptions, but instead simple descriptions which still contain the essential features 
of quantum theory. For this reason, the term “toy model” seems appropriate. 

Let us suppose that the quantum wave function is of the form $\psi(m)$, with $m$ an
integer in the range

$$-M_a \le m \le M_b, \tag{2.23}$$

where $M_a$ and $M_b$ are fixed integers, so $m$ can take on $M = M_a + M_b + 1$ different
values. Such wave functions form an $M$-dimensional Hilbert space. For example,
if $M_a = 1 = M_b$, the particle can be at one of the three sites, $m = -1, 0, 1$, and
its wave function is completely specified by the $M = 3$ complex numbers $\psi(-1)$,
$\psi(0)$, and $\psi(1)$. The inner product of two wave functions is given by

$$\langle\phi|\psi\rangle = \sum_m \phi^*(m)\psi(m), \tag{2.24}$$

where the sum is over those values of $m$ allowed by (2.23), and the norm of $\psi$ is
the positive square root of

$$\|\psi\|^2 = \sum_m |\psi(m)|^2. \tag{2.25}$$


The toy wave function $\chi_n$, defined by

$$\chi_n(m) = \delta_{mn} = \begin{cases} 1 & \text{if } m = n, \\ 0 & \text{for } m \ne n, \end{cases} \tag{2.26}$$


where $\delta_{mn}$ is the Kronecker delta function, has the physical significance that the
particle is at site $n$ (or in cell $n$). Now suppose that $M_a = 3 = M_b$, and consider
the wave function

$$\psi(m) = \chi_{-1}(m) + 1.5\chi_0(m) + \chi_1(m). \tag{2.27}$$


It is sketched in Fig. 2.7, and one can think of it as a relatively coarse approximation
to a continuous function of the sort shown in Fig. 2.2, with $x_1 = -2$, $x_2 = +2$.
What can one say about the location of the particle whose quantum wave function 
is given by (2.27)? 


Fig. 2.7. The toy wave packet (2.27).

In light of the discussion in Sec. 2.3 it seems sensible to interpret $\psi(m)$ as signifying
that the position of the quantum particle is not outside the interval $[-1, +1]$,
where by $[-1, +1]$ we mean the three values $-1$, 0, and $+1$. The circumlocution
“not outside the interval” can be replaced with the more natural “inside the interval”
provided the latter is not interpreted to mean “at a particular site inside this
interval”, since the particle described by (2.27) cannot be said to be at $m = -1$ or
at $m = 0$ or at $m = 1$. Instead it is delocalized, and its position cannot be specified
any more precisely than by giving the interval $[-1, +1]$. There is no concise
way of stating this in English, which is one reason we need a mathematical notation
in which quantum properties can be expressed in a precise way; this will be
introduced in Ch. 4.
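
As a small illustration (a sketch, not from the original text), the toy wave packet (2.27) can be stored as a table of amplitudes indexed by site. The set of sites where it is nonzero is then the interval the particle is "not outside", and the norm follows from (2.25). The variable names are illustrative.

```python
import math

# Toy wave packet (2.27) on the sites m = -3, ..., 3 (M_a = M_b = 3).
sites = list(range(-3, 4))
psi = {m: 0.0 for m in sites}
psi[-1], psi[0], psi[1] = 1.0, 1.5, 1.0

# Sites where psi is nonzero: the particle is "not outside" this set.
support = [m for m in sites if psi[m] != 0]

# Norm as the positive square root of (2.25).
norm = math.sqrt(sum(abs(psi[m]) ** 2 for m in sites))

print(support)  # [-1, 0, 1]
print(norm)
```

The support says only that the particle is in the interval; it does not assign it to any one of the three sites.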

It is important not to look at a wave function written out as a sum of different 
pieces whose physical significance one understands, and interpret it in physical 
terms as meaning the quantum system has one or the other of the properties cor- 
responding to the different pieces. In particular, one should not interpret (2.27) to 
mean that the particle is at $m = -1$ or at $m = 0$ or at $m = 1$. A simple example
which illustrates how such an interpretation can lead one astray is obtained by
writing $\chi_0$ in the form

$$\chi_0(m) = (1/2)[\chi_0(m) + i\chi_2(m)] + (1/2)[\chi_0(m) + (-i)\chi_2(m)]. \tag{2.28}$$

If we carelessly interpret “+” to mean “or”, then both of the functions in square
brackets on the right side of (2.28), and therefore also their sum, have the interpretation
that the particle is at 0 or 2, whereas in fact $\chi_0(m)$ means that the particle is at
0 and not at 2. The correct quantum mechanical way to use “or” will be discussed
in Secs. 4.5, 4.6, and 5.2.
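
The identity (2.28) is easy to confirm by direct arithmetic. The sketch below (illustrative, with the hypothetical helper `chi` standing for the toy wave function of (2.26)) builds the two bracketed pieces, each of which is nonzero at site 2, and shows that their sum vanishes there.

```python
def chi(n, sites):
    """Toy wave function (2.26): 1 at site n, 0 elsewhere."""
    return [1 if m == n else 0 for m in sites]

sites = list(range(-3, 4))
chi0, chi2 = chi(0, sites), chi(2, sites)

# The two bracketed pieces on the right side of (2.28), each with its factor 1/2.
piece_a = [0.5 * (a + 1j * b) for a, b in zip(chi0, chi2)]
piece_b = [0.5 * (a - 1j * b) for a, b in zip(chi0, chi2)]
total = [a + b for a, b in zip(piece_a, piece_b)]

at_site_2 = sites.index(2)
print(piece_a[at_site_2], piece_b[at_site_2])  # both nonzero
print(total == chi0)                           # True: the sum is chi_0 again
```

Each piece alone has an amplitude at site 2, but the amplitudes cancel in the sum, which is why "+" cannot be read as "or".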

Just as $\psi(m)$ is a discrete version of the position wave function $\psi(x)$, there is
also a discrete version $\hat\psi(k)$ of the momentum wave function $\hat\psi(p)$, given by the
formula

$$\hat\psi(k) = \frac{1}{\sqrt{M}} \sum_m e^{-2\pi i k m/M}\, \psi(m), \tag{2.29}$$

where $k$ is an integer which can take on the same set of values as $m$, (2.23). The
inverse transformation is

$$\psi(m) = \frac{1}{\sqrt{M}} \sum_k e^{2\pi i k m/M}\, \hat\psi(k). \tag{2.30}$$

The inner product of two states, (2.24), can equally well be written in terms of
momentum wave functions:

$$\langle\phi|\psi\rangle = \sum_k \hat\phi^*(k)\hat\psi(k). \tag{2.31}$$

These expressions are similar to those in (2.15)-(2.17). The main difference is
that integrals have been replaced by sums. The reason $\hbar$ has disappeared from
the toy model expressions is that position and momentum are being expressed
in dimensionless units.
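
The discrete transform pair can be tried out directly. The following is a minimal sketch under the assumption $M_a = M_b = 3$ (so $M = 7$); the function names `to_momentum`, `to_position`, and `inner`, and the second wave function `phi`, are illustrative and not from the text. It checks that (2.30) undoes (2.29) and that the inner product is the same in both representations.

```python
import cmath
import math

M_A, M_B = 3, 3
sites = list(range(-M_A, M_B + 1))
M = len(sites)

def to_momentum(psi):
    """Discrete transform (2.29); psi maps site m to its amplitude."""
    return {k: sum(cmath.exp(-2j * math.pi * k * m / M) * psi[m] for m in sites)
            / math.sqrt(M)
            for k in sites}

def to_position(psi_hat):
    """Inverse transform (2.30)."""
    return {m: sum(cmath.exp(2j * math.pi * k * m / M) * psi_hat[k] for k in sites)
            / math.sqrt(M)
            for m in sites}

def inner(phi, psi):
    """Inner product (2.24), or (2.31) when given momentum wave functions."""
    return sum(phi[j].conjugate() * psi[j] for j in sites)

psi = {m: complex(v) for m, v in zip(sites, [0, 0, 1, 1.5, 1, 0, 0])}
phi = {m: complex(v) for m, v in zip(sites, [0, 1j, 0, 1, 0, 0, 2])}
psi_hat, phi_hat = to_momentum(psi), to_momentum(phi)

round_trip = to_position(psi_hat)
round_trip_error = max(abs(round_trip[m] - psi[m]) for m in sites)
inner_error = abs(inner(phi, psi) - inner(phi_hat, psi_hat))
print(round_trip_error, inner_error)  # both close to 0
```

The direct double sum is used for transparency; for large $M$ a fast Fourier transform would be the usual choice.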


3.1 Hilbert space and inner product

In Ch. 2 it was noted that quantum wave functions form a linear space in the sense
that multiplying a function by a complex number or adding two wave functions
together produces another wave function. It was also pointed out that a particular
quantum state can be represented either by a wave function $\psi(x)$ which depends
upon the position variable $x$, or by an alternative function $\hat\psi(p)$ of the momentum
variable $p$. It is convenient to employ the Dirac symbol $|\psi\rangle$, known as a “ket”, to
denote a quantum state without referring to the particular function used to represent
it. The kets, which we shall also refer to as vectors to distinguish them from
scalars, which are complex numbers, are the elements of the quantum Hilbert space
$\mathcal{H}$. (The real numbers form a subset of the complex numbers, so that when a scalar
is referred to as a “complex number”, this includes the possibility that it might be
a real number.)

If $\alpha$ is any scalar (complex number), the ket corresponding to the wave function
$\alpha\psi(x)$ is denoted by $\alpha|\psi\rangle$, or sometimes by $|\psi\rangle\alpha$, and the ket corresponding to
$\phi(x) + \psi(x)$ is denoted by $|\phi\rangle + |\psi\rangle$ or $|\psi\rangle + |\phi\rangle$, and so forth. This correspondence
could equally well be expressed using momentum wave functions, because the
Fourier transform, (2.15) or (2.16), is a linear relationship between $\psi(x)$ and $\hat\psi(p)$,
so that $\alpha\phi(x) + \beta\psi(x)$ and $\alpha\hat\phi(p) + \beta\hat\psi(p)$ correspond to the same quantum state
$\alpha|\phi\rangle + \beta|\psi\rangle$. The addition of kets and multiplication by scalars obey some fairly
obvious rules:

$$\begin{aligned} \alpha(\beta|\psi\rangle) &= (\alpha\beta)|\psi\rangle, & (\alpha + \beta)|\psi\rangle &= \alpha|\psi\rangle + \beta|\psi\rangle, \\ \alpha(|\phi\rangle + |\psi\rangle) &= \alpha|\phi\rangle + \alpha|\psi\rangle, & 1|\psi\rangle &= |\psi\rangle. \end{aligned} \tag{3.1}$$

Multiplying any ket by the number 0 yields the unique zero vector or zero ket,
which will, because there is no risk of confusion, also be denoted by 0.

The linear space $\mathcal{H}$ is equipped with an inner product

$$\mathcal{I}(|\omega\rangle, |\psi\rangle) = \langle\omega|\psi\rangle \tag{3.2}$$

which assigns to any pair of kets $|\omega\rangle$ and $|\psi\rangle$ a complex number. While the Dirac
notation $\langle\omega|\psi\rangle$, already employed in Ch. 2, is more compact than the one based
on $\mathcal{I}(\,,\,)$, it is, for purposes of exposition, useful to have a way of writing the inner
product which clearly indicates how it depends on two different ket vectors.

An inner product must satisfy the following conditions: 

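1. Interchanging the two arguments results in the complex conjugate of the
original expression:

$$\mathcal{I}(|\psi\rangle, |\omega\rangle) = [\mathcal{I}(|\omega\rangle, |\psi\rangle)]^*. \tag{3.3}$$

2. The inner product is linear as a function of its second argument:

$$\mathcal{I}(|\omega\rangle, \alpha|\phi\rangle + \beta|\psi\rangle) = \alpha\,\mathcal{I}(|\omega\rangle, |\phi\rangle) + \beta\,\mathcal{I}(|\omega\rangle, |\psi\rangle). \tag{3.4}$$

3. The inner product is an antilinear function of its first argument:

$$\mathcal{I}(\alpha|\phi\rangle + \beta|\psi\rangle, |\omega\rangle) = \alpha^*\mathcal{I}(|\phi\rangle, |\omega\rangle) + \beta^*\mathcal{I}(|\psi\rangle, |\omega\rangle). \tag{3.5}$$

4. The inner product of a ket with itself,

$$\mathcal{I}(|\psi\rangle, |\psi\rangle) = \langle\psi|\psi\rangle = \|\psi\|^2 \tag{3.6}$$

is a positive (greater than 0) real number unless $|\psi\rangle$ is the zero vector, in
which case $\langle\psi|\psi\rangle = 0$.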

The term “antilinear” in the third condition refers to the fact that the complex
conjugates of $\alpha$ and $\beta$ appear on the right side of (3.5), rather than $\alpha$ and $\beta$ themselves,
as would be the case for a linear function. Actually, (3.5) is an immediate
consequence of (3.3) and (3.4) (simply take the complex conjugate of both sides
of (3.4), and then apply (3.3)), but it is of sufficient importance that it is worth
stating separately. The reader can check that the inner products defined in (2.3)
and (2.24) satisfy these conditions. (There are some subtleties associated with
$\psi(x)$ when $x$ is a continuous real number, but we must leave discussion of these
matters to books on functional analysis.)
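
For the toy-model inner product (2.24) the four conditions can be spot-checked numerically. The sketch below is illustrative only: the three-site wave functions and the scalars are arbitrary values chosen here, and a numerical check on one example is of course not a proof.

```python
def inner(phi, psi):
    """Toy-model inner product (2.24): sum over sites of phi*(m) psi(m)."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def comb(alpha, u, beta, v):
    """The linear combination alpha*u + beta*v of two wave functions."""
    return [alpha * a + beta * b for a, b in zip(u, v)]

def close(x, y):
    return abs(x - y) < 1e-12

# Arbitrary three-site wave functions and scalars (illustrative values only).
phi = [1 + 2j, 0.5j, -1 + 0j]
psi = [2 - 1j, 1 + 1j, 3j]
omega = [0.5 + 0j, -2j, 1 - 1j]
alpha, beta = 2 - 3j, -1 + 0.5j

cond_3_3 = close(inner(psi, omega), inner(omega, psi).conjugate())
cond_3_4 = close(inner(omega, comb(alpha, phi, beta, psi)),
                 alpha * inner(omega, phi) + beta * inner(omega, psi))
cond_3_5 = close(inner(comb(alpha, phi, beta, psi), omega),
                 alpha.conjugate() * inner(phi, omega)
                 + beta.conjugate() * inner(psi, omega))
cond_3_6 = abs(inner(psi, psi).imag) < 1e-12 and inner(psi, psi).real > 0
print(cond_3_3, cond_3_4, cond_3_5, cond_3_6)
```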

The positive square root $\|\psi\|$ of $\|\psi\|^2$ in (3.6) is called the norm of $|\psi\rangle$. As
already noted in Ch. 2, $\alpha|\psi\rangle$ and $|\psi\rangle$ have exactly the same physical significance
if $\alpha$ is a nonzero complex number. Consequently, as far as the quantum physicist
is concerned, the actual norm, as long as it is positive, is a matter of indifference.
By multiplying a nonzero ket by a suitable constant, one can always make its norm
equal to 1. This process is called normalizing the ket, and a ket with norm equal to
1 is said to be normalized. Normalizing does not produce a unique result, because
$e^{i\phi}|\psi\rangle$, where $\phi$ is an arbitrary real number or phase, has precisely the same norm
as $|\psi\rangle$. Two kets $|\phi\rangle$ and $|\psi\rangle$ are said to be orthogonal if $\langle\phi|\psi\rangle = 0$, which by
(3.3) implies that $\langle\psi|\phi\rangle = 0$.


3.2 Linear functionals and the dual space 

Let $|\omega\rangle$ be some fixed element of $\mathcal{H}$. Then the function

$$J(|\psi\rangle) = \mathcal{I}(|\omega\rangle, |\psi\rangle) \tag{3.7}$$

assigns to every $|\psi\rangle$ in $\mathcal{H}$ a complex number in a linear manner,

$$J(\alpha|\phi\rangle + \beta|\psi\rangle) = \alpha J(|\phi\rangle) + \beta J(|\psi\rangle), \tag{3.8}$$

as a consequence of (3.4). Such a function is called a linear functional. There
are many different linear functionals of this sort, one for every $|\omega\rangle$ in $\mathcal{H}$. In order
to distinguish them we could place a label on $J$ and, for example, write it as
$J_{|\omega\rangle}(|\psi\rangle)$. The notation $J_{|\omega\rangle}$ is a bit clumsy, even if its meaning is clear, and
Dirac’s $\langle\omega|$, called a “bra”, provides a simpler way to denote the same object, so
that (3.8) takes the form

$$\langle\omega|(\alpha|\phi\rangle + \beta|\psi\rangle) = \alpha\langle\omega|\phi\rangle + \beta\langle\omega|\psi\rangle, \tag{3.9}$$

if we also use the compact Dirac notation for inner products.

Among the advantages of (3.9) over (3.8) is that the former looks very much
like the distributive law for multiplication if one takes the simple step of replacing
$\langle\omega| \cdot |\psi\rangle$ by $\langle\omega|\psi\rangle$. Indeed, a principal virtue of Dirac notation is that many
different operations of this general type become “automatic”, allowing one to concentrate
on issues of physics without getting overly involved in mathematical bookkeeping.
However, if one is in doubt about what Dirac notation really means, it may
be helpful to check things out by going back to the more awkward but also more
familiar notation of functions, such as $\mathcal{I}(\,,\,)$ and $J(\,)$.

Linear functionals can themselves be added together and multiplied by complex
numbers, and the rules are fairly obvious. Thus the right side of

$$[\alpha\langle\tau| + \beta\langle\omega|](|\psi\rangle) = \alpha\langle\tau|\psi\rangle + \beta\langle\omega|\psi\rangle \tag{3.10}$$

gives the complex number obtained when the linear functional $\alpha\langle\tau| + \beta\langle\omega|$, formed
by addition following multiplication by scalars, and placed inside square brackets
for clarity, is applied to the ket $|\psi\rangle$. Thus linear functionals themselves form a
linear space, called the dual of the space $\mathcal{H}$; we shall denote it by $\mathcal{H}^\dagger$.

Although $\mathcal{H}$ and $\mathcal{H}^\dagger$ are not identical spaces (the former is inhabited by kets
and the latter by bras), the two are closely related. There is a one-to-one map
from one to the other denoted by a dagger:

$$\langle\omega| = (|\omega\rangle)^\dagger, \qquad |\omega\rangle = (\langle\omega|)^\dagger. \tag{3.11}$$

The parentheses may be omitted when it is obvious what the dagger operation
applies to, but including them does no harm. The dagger map is antilinear,

$$\begin{aligned} (\alpha|\phi\rangle + \beta|\psi\rangle)^\dagger &= \alpha^*\langle\phi| + \beta^*\langle\psi|, \\ (\gamma\langle\tau| + \delta\langle\omega|)^\dagger &= \gamma^*|\tau\rangle + \delta^*|\omega\rangle, \end{aligned} \tag{3.12}$$

reflecting the fact that the inner product $\mathcal{I}$ is antilinear in its left argument, (3.5).
When applied twice in a row, the dagger operation is the identity map:

$$((|\omega\rangle)^\dagger)^\dagger = |\omega\rangle, \qquad ((\langle\omega|)^\dagger)^\dagger = \langle\omega|. \tag{3.13}$$

There are occasions when the Dirac notation $\langle\omega|\psi\rangle$ is not convenient because it
is too compact. In such cases the dagger operation can be useful, because $(|\omega\rangle)^\dagger|\psi\rangle$
means the same thing as $\langle\omega|\psi\rangle$. Thus, for example,

$$(\alpha|\tau\rangle + \beta|\omega\rangle)^\dagger|\psi\rangle = (\alpha^*\langle\tau| + \beta^*\langle\omega|)|\psi\rangle = \alpha^*\langle\tau|\psi\rangle + \beta^*\langle\omega|\psi\rangle \tag{3.14}$$

is one way to express the fact that the inner product is antilinear in its first argument,
(3.5), without having to employ $\mathcal{I}(\,,\,)$.
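
In components relative to an orthonormal basis, the dagger map amounts to complex conjugation of the entries, and a bra acts on a ket by an ordinary sum of products. The sketch below assumes that representation (which the text only introduces later, in Sec. 3.6) and uses illustrative names and values; it checks (3.13) and (3.14) on one example.

```python
def dagger(vec):
    """Dagger map (3.11) in components: complex-conjugate each entry.
    Kets and bras are both plain lists here; which one a list stands for
    is tracked only by how it is used."""
    return [z.conjugate() for z in vec]

def apply_bra(bra, ket):
    """Evaluate the linear functional `bra` on `ket` (no conjugation here)."""
    return sum(b * k for b, k in zip(bra, ket))

def comb(alpha, u, beta, v):
    """The linear combination alpha*u + beta*v."""
    return [alpha * a + beta * b for a, b in zip(u, v)]

tau = [1 + 1j, 2 + 0j, -1j]
omega = [0.5j, 1 - 1j, 3 + 0j]
psi = [2 + 0j, 1j, 1 + 1j]
alpha, beta = 1 - 2j, 3 + 1j

# (3.14): (alpha|tau> + beta|omega>)^dagger applied to |psi>.
lhs = apply_bra(dagger(comb(alpha, tau, beta, omega)), psi)
rhs = (alpha.conjugate() * apply_bra(dagger(tau), psi)
       + beta.conjugate() * apply_bra(dagger(omega), psi))
antilinear_ok = abs(lhs - rhs) < 1e-12
twice_is_identity = dagger(dagger(omega)) == omega   # (3.13)
print(antilinear_ok, twice_is_identity)
```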


3.3 Operators, dyads

A linear operator, or simply an operator, $A$ is a linear function which maps $\mathcal{H}$ into
itself. That is, to each $|\psi\rangle$ in $\mathcal{H}$, $A$ assigns another element $A(|\psi\rangle)$ in $\mathcal{H}$ in such a
way that

$$A(\alpha|\phi\rangle + \beta|\psi\rangle) = \alpha A(|\phi\rangle) + \beta A(|\psi\rangle) \tag{3.15}$$

whenever $|\phi\rangle$ and $|\psi\rangle$ are any two elements of $\mathcal{H}$, and $\alpha$ and $\beta$ are complex numbers.
One customarily omits the parentheses and writes $A|\phi\rangle$ instead of $A(|\phi\rangle)$
where this will not cause confusion, as on the right (but not the left) side of (3.15).
In general we shall use capital letters, $A$, $B$, and so forth, to denote operators. The
letter $I$ is reserved for the identity operator which maps every element of $\mathcal{H}$ to
itself:

$$I|\psi\rangle = |\psi\rangle. \tag{3.16}$$

The zero operator which maps every element of $\mathcal{H}$ to the zero vector will be denoted
by 0.

The inner product of some element $|\phi\rangle$ of $\mathcal{H}$ with the ket $A|\psi\rangle$ can be written as

$$(|\phi\rangle)^\dagger A|\psi\rangle = \langle\phi|A|\psi\rangle, \tag{3.17}$$

where the notation on the right side, the “sandwich” with the operator between a bra
and a ket, is standard Dirac notation. It is often referred to as a “matrix element”,
even when no matrix is actually under consideration. (Matrices are discussed in
Sec. 3.6.) One can write $\langle\phi|A|\psi\rangle$ as $(\langle\phi|A)(|\psi\rangle)$, and think of it as the linear
functional or bra vector

$$\langle\phi|A \tag{3.18}$$

acting on or evaluated at $|\psi\rangle$. In this sense it is natural to think of a linear operator
$A$ on $\mathcal{H}$ as inducing a linear map of the dual space $\mathcal{H}^\dagger$ onto itself, which carries $\langle\phi|$
to $\langle\phi|A$. This map can also, without risk of confusion, be denoted by $A$, and while
one could write it as $A(\langle\phi|)$, in Dirac notation $\langle\phi|A$ is more natural. Sometimes
one speaks of “the operator $A$ acting to the left”.

Dirac notation is particularly convenient in the case of a simple type of operator
known as a dyad, written as a ket followed by a bra, $|\omega\rangle\langle\tau|$. Applied to some ket
$|\psi\rangle$ in $\mathcal{H}$, it yields

$$|\omega\rangle\langle\tau|(|\psi\rangle) = |\omega\rangle\langle\tau|\psi\rangle = \langle\tau|\psi\rangle|\omega\rangle. \tag{3.19}$$

Just as in (3.9), the first equality is “obvious” if one thinks of the product of $\langle\tau|$
with $|\psi\rangle$ as $\langle\tau|\psi\rangle$, and since the latter is a scalar it can be placed either after or in
front of the ket $|\omega\rangle$. Setting $A$ in (3.17) equal to the dyad $|\omega\rangle\langle\tau|$ yields

$$\langle\phi|(|\omega\rangle\langle\tau|)|\psi\rangle = \langle\phi|\omega\rangle\langle\tau|\psi\rangle, \tag{3.20}$$

where the right side is the product of the two scalars $\langle\phi|\omega\rangle$ and $\langle\tau|\psi\rangle$. Once again
the virtues of Dirac notation are evident in that this result is an almost automatic
consequence of writing the symbols in the correct order.
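
In a finite-dimensional space with components taken in an orthonormal basis, a dyad becomes an outer-product matrix. The following sketch assumes that matrix picture (see Sec. 3.6) and uses illustrative kets; the helper names `bracket`, `dyad`, and `apply` are not from the text. It checks (3.19) and (3.20) on one example.

```python
def bracket(phi, psi):
    """<phi|psi> for kets given as lists of complex components."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def dyad(omega, tau):
    """Matrix of the dyad |omega><tau|: entry (j, k) is omega_j * conj(tau_k)."""
    return [[w * t.conjugate() for t in tau] for w in omega]

def apply(op, ket):
    """Apply a matrix (list of rows) to a ket."""
    return [sum(a * k for a, k in zip(row, ket)) for row in op]

omega = [1 + 0j, 1j]
tau = [2 + 0j, 1 - 1j]
phi = [1j, 3 + 0j]
psi = [1 + 1j, -2 + 0j]

D = dyad(omega, tau)
lhs_319 = apply(D, psi)                               # |omega><tau| applied to |psi>
rhs_319 = [bracket(tau, psi) * w for w in omega]      # <tau|psi> |omega>
lhs_320 = bracket(phi, apply(D, psi))                 # <phi|(|omega><tau|)|psi>
rhs_320 = bracket(phi, omega) * bracket(tau, psi)

ok_319 = all(abs(a - b) < 1e-12 for a, b in zip(lhs_319, rhs_319))
ok_320 = abs(lhs_320 - rhs_320) < 1e-12
print(ok_319, ok_320)
```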

The collection of all operators is itself a linear space, since a scalar times an
operator is an operator, and the sum of two operators is also an operator. The
operator $\alpha A + \beta B$ applied to an element $|\psi\rangle$ of $\mathcal{H}$ yields the result:

$$(\alpha A + \beta B)|\psi\rangle = \alpha(A|\psi\rangle) + \beta(B|\psi\rangle), \tag{3.21}$$

where the parentheses on the right side can be omitted, since $(\alpha A)|\psi\rangle$ is equal to
$\alpha(A|\psi\rangle)$, and both can be written as $\alpha A|\psi\rangle$.

The product $AB$ of two operators $A$ and $B$ is the operator obtained by first
applying $B$ to some ket, and then $A$ to the ket which results from applying $B$:

$$AB(|\psi\rangle) = A(B(|\psi\rangle)). \tag{3.22}$$

Normally the parentheses are omitted, and one simply writes $AB|\psi\rangle$. However,
it is very important to note that operator multiplication, unlike multiplication of
scalars, is not commutative: in general, $AB \ne BA$, since there is no particular
reason to expect that $A(B(|\psi\rangle))$ will be the same element of $\mathcal{H}$ as $B(A(|\psi\rangle))$.

In the exceptional case in which $AB = BA$, that is, $AB|\psi\rangle = BA|\psi\rangle$ for
all $|\psi\rangle$, one says that these two operators commute with each other, or (simply)
commute. The identity operator $I$ commutes with every other operator, $IA = AI = A$,
and the same is true of the zero operator, $A0 = 0A = 0$. The operators
in a collection $\{A_1, A_2, A_3, \ldots\}$ are said to commute with each other provided

$$A_j A_k = A_k A_j \tag{3.23}$$

for every $j$ and $k$.

Operator products follow the usual distributive laws, and scalars can be placed
anywhere in a product, though one usually moves them to the left side:

$$\begin{aligned} A(\gamma C + \delta D) &= \gamma AC + \delta AD, \\ (\alpha A + \beta B)C &= \alpha AC + \beta BC. \end{aligned} \tag{3.24}$$

In working out such products it is important that the order of the operators, from
left to right, be preserved: one cannot (in general) replace $AC$ with $CA$. The
operator product of two dyads $|\omega\rangle\langle\tau|$ and $|\psi\rangle\langle\phi|$ is fairly obvious if one uses Dirac
notation:

$$|\omega\rangle\langle\tau| \cdot |\psi\rangle\langle\phi| = |\omega\rangle\langle\tau|\psi\rangle\langle\phi| = \langle\tau|\psi\rangle|\omega\rangle\langle\phi|, \tag{3.25}$$

where the final answer is a scalar $\langle\tau|\psi\rangle$ multiplying the dyad $|\omega\rangle\langle\phi|$. Multiplication
in the reverse order will yield an operator proportional to $|\psi\rangle\langle\tau|$, so in general
two dyads do not commute with each other.
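
The dyad product rule and the failure of commutativity can both be seen in a two-dimensional example. This is a sketch in the matrix picture with illustrative kets (the names `bracket`, `dyad`, and `matmul` are assumptions of the sketch, not from the text).

```python
def bracket(phi, psi):
    """<phi|psi> for kets given as lists of complex components."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def dyad(omega, tau):
    """Matrix of the dyad |omega><tau|."""
    return [[w * t.conjugate() for t in tau] for w in omega]

def matmul(A, B):
    """Operator product AB: B is applied first, then A."""
    n = len(B)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]

omega, tau = [1 + 0j, 0j], [1 + 0j, 1 + 0j]
psi, phi = [1 + 0j, 1j], [0j, 1 + 0j]

left = matmul(dyad(omega, tau), dyad(psi, phi))
right = [[bracket(tau, psi) * z for z in row] for row in dyad(omega, phi)]
reverse = matmul(dyad(psi, phi), dyad(omega, tau))

print(left == right)    # True, as in (3.25)
print(left == reverse)  # False for these kets: the two dyads do not commute
```

For this particular choice $\langle\phi|\omega\rangle = 0$, so the product in the reverse order is the zero operator, while the product in the original order is not.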

Given an operator $A$, if one can find an operator $B$ such that

$$AB = I = BA, \tag{3.26}$$

then $B$ is called the inverse of the operator $A$, written as $A^{-1}$, and $A$ is the inverse
of the operator $B$. On a finite-dimensional Hilbert space one only needs to
check one of the equalities in (3.26), as it implies the other, whereas on an
infinite-dimensional space both must be checked. Many operators do not possess inverses,
but if an inverse exists, it is unique.

The antilinear dagger operation introduced earlier, (3.11) and (3.12), can also be
applied to operators. For a dyad one has:

$$(|\omega\rangle\langle\tau|)^\dagger = |\tau\rangle\langle\omega|. \tag{3.27}$$

Note that the right side is obtained by applying $\dagger$ separately to each term in the
ket-bra “product” $|\omega\rangle\langle\tau|$ on the left, following the prescription in (3.11), and then
writing the results in reverse order. When applying it to linear combinations of
dyads, one needs to remember that the dagger operation is antilinear:

$$(\alpha|\omega\rangle\langle\tau| + \beta|\phi\rangle\langle\psi|)^\dagger = \alpha^*|\tau\rangle\langle\omega| + \beta^*|\psi\rangle\langle\phi|. \tag{3.28}$$

By generalizing (3.28) in an obvious way, one can apply the dagger operation to
any sum of dyads, and thus to any operator on a finite-dimensional Hilbert space
$\mathcal{H}$, since any operator can be written as a sum of dyads. However, the following
definition is more useful. Given an operator $A$, its adjoint $(A)^\dagger$, usually written as
$A^\dagger$, is the unique operator such that

$$\langle\psi|A^\dagger|\phi\rangle = \langle\phi|A|\psi\rangle^* \tag{3.29}$$

for any $|\phi\rangle$ and $|\psi\rangle$ in $\mathcal{H}$. Note that bra and ket are interchanged on the two sides
of the equation. A useful mnemonic for expressions such as (3.29) is to think of
complex conjugation as a special case of the dagger operation when that is applied
to a scalar. Then the right side can be written and successively transformed,

$$(\langle\phi|A|\psi\rangle)^\dagger = (|\psi\rangle)^\dagger A^\dagger (\langle\phi|)^\dagger = \langle\psi|A^\dagger|\phi\rangle, \tag{3.30}$$

into the left side of (3.29) using the general rule that a dagger applied to a product
is the product of the result of applying it to the individual factors, but written in the
reverse order.
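
When an operator is written as a matrix in an orthonormal basis, its adjoint is the conjugate transpose. Taking that standard fact as an assumption (matrices are only introduced in Sec. 3.6), the defining relation (3.29) can be checked on an arbitrary example; the matrix and kets below are illustrative.

```python
def bracket(phi, psi):
    """<phi|psi> for kets given as lists of complex components."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(op, ket):
    """Apply a matrix (list of rows) to a ket."""
    return [sum(a * k for a, k in zip(row, ket)) for row in op]

def adjoint(op):
    """Conjugate transpose of a square matrix."""
    n = len(op)
    return [[op[k][j].conjugate() for k in range(n)] for j in range(n)]

A = [[1 + 2j, 3 + 0j], [-1j, 2 - 1j]]
phi = [1 + 1j, 2 + 0j]
psi = [0.5j, 1 - 3j]

lhs = bracket(psi, apply(adjoint(A), phi))       # <psi|A^dagger|phi>
rhs = bracket(phi, apply(A, psi)).conjugate()    # <phi|A|psi>*
adjoint_ok = abs(lhs - rhs) < 1e-12
print(adjoint_ok)
```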

The adjoint of a linear combination of operators is what one would expect,

$$(\alpha A + \beta B)^\dagger = \alpha^* A^\dagger + \beta^* B^\dagger, \tag{3.31}$$

in light of (3.28) and the fact that the dagger operation is antilinear. The adjoint of
a product of operators is the product of the adjoints in the reverse order:

$$(AB)^\dagger = B^\dagger A^\dagger, \qquad (ABC)^\dagger = C^\dagger B^\dagger A^\dagger, \tag{3.32}$$

and so forth. The dagger operation, see (3.11), applied to a ket of the form $A|\psi\rangle$
yields a linear functional or bra vector

$$(A|\psi\rangle)^\dagger = \langle\psi|A^\dagger, \tag{3.33}$$

where the right side should be interpreted in the same way as (3.18): the operator
$A^\dagger$ on $\mathcal{H}$ induces a map, denoted by the same symbol $A^\dagger$, on the space $\mathcal{H}^\dagger$ of linear
functionals, by “operating to the left”. One can check that (3.33) is consistent with
(3.29).

An operator which is equal to its adjoint, $A = A^\dagger$, is said to be Hermitian or
self-adjoint. (The two terms mean the same thing for operators on finite-dimensional
spaces, but have different meanings for infinite-dimensional spaces.) Given that
the dagger operation is in some sense a generalization of complex conjugation, one
will not be surprised to learn that Hermitian operators behave in many respects like
real numbers, a point to which we shall return in Ch. 5.


3.4 Projectors and subspaces 

A particular type of Hermitian operator called a projector plays a central role in
quantum theory. A projector is any operator $P$ which satisfies the two conditions

$$P^2 = P, \qquad P^\dagger = P. \tag{3.34}$$

The first of these, $P^2 = P$, defines a projection operator which need not be Hermitian.
Hermitian projection operators are also called orthogonal projection operators,
but we shall call them projectors. Associated with a projector $P$ is a linear
subspace $\mathcal{P}$ of $\mathcal{H}$ consisting of all kets which are left unchanged by $P$, that is, those
$|\psi\rangle$ for which $P|\psi\rangle = |\psi\rangle$. We shall say that $P$ projects onto $\mathcal{P}$, or is the projector
onto $\mathcal{P}$. The projector $P$ acts like the identity operator on the subspace $\mathcal{P}$. The
identity operator $I$ is a projector, and it projects onto the entire Hilbert space $\mathcal{H}$.
The zero operator 0 is a projector which projects onto the subspace consisting of
nothing but the zero vector.

Any nonzero ket $|\phi\rangle$ generates a one-dimensional subspace $\mathcal{P}$, often called a ray
or (by quantum physicists) a pure state, consisting of all scalar multiples of $|\phi\rangle$,
that is to say, the collection of kets of the form $\{\alpha|\phi\rangle\}$, where $\alpha$ is any complex
number. The projector onto $\mathcal{P}$ is the dyad

$$P = [\phi] = |\phi\rangle\langle\phi| / \langle\phi|\phi\rangle, \tag{3.35}$$

where the right side is simply $|\phi\rangle\langle\phi|$ if $|\phi\rangle$ is normalized, which we shall assume to
be the case in the following discussion. The symbol $[\phi]$ for the projector projecting
onto the ray generated by $|\phi\rangle$ is not part of standard Dirac notation, but it is very
convenient, and will be used throughout this book. Sometimes, when it will not
cause confusion, the square brackets will be omitted: $\phi$ will be used in place of
$[\phi]$. It is straightforward to show that the dyad (3.35) satisfies the conditions in
(3.34) and that

$$P(\alpha|\phi\rangle) = |\phi\rangle\langle\phi|(\alpha|\phi\rangle) = \alpha|\phi\rangle\langle\phi|\phi\rangle = \alpha|\phi\rangle, \tag{3.36}$$

so that $P$ leaves the elements of $\mathcal{P}$ unchanged. When it acts on any vector $|\chi\rangle$
orthogonal to $|\phi\rangle$, $\langle\phi|\chi\rangle = 0$, $P$ produces the zero vector:

$$P|\chi\rangle = |\phi\rangle\langle\phi|\chi\rangle = 0|\phi\rangle = 0. \tag{3.37}$$
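
These properties of a ray projector can be verified on a small example. The sketch below works in the matrix picture with a deliberately unnormalized ket, so that the denominator in (3.35) matters; the kets and helper names are illustrative assumptions.

```python
def bracket(phi, psi):
    """<phi|psi> for kets given as lists of complex components."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def ray_projector(phi):
    """Matrix of [phi] = |phi><phi| / <phi|phi>, following (3.35)."""
    norm_sq = bracket(phi, phi).real
    return [[a * b.conjugate() / norm_sq for b in phi] for a in phi]

def apply(op, ket):
    """Apply a matrix (list of rows) to a ket."""
    return [sum(a * k for a, k in zip(row, ket)) for row in op]

def near(u, v, tol=1e-12):
    """True if two kets agree component by component within tol."""
    return all(abs(a - b) < tol for a, b in zip(u, v))

phi = [3 + 0j, 4j]          # deliberately not normalized: <phi|phi> = 25
chi = [4j, 3 + 0j]          # orthogonal to phi: <phi|chi> = 0
omega = [1 + 1j, 2 - 3j]    # an arbitrary ket
P = ray_projector(phi)

alpha = 2 - 1j
scaled = [alpha * a for a in phi]
keeps_ray = near(apply(P, scaled), scaled)                     # (3.36)
kills_orthogonal = near(apply(P, chi), [0, 0])                 # (3.37)
idempotent = near(apply(P, apply(P, omega)), apply(P, omega))  # P^2 = P
hermitian = all(abs(P[j][k] - P[k][j].conjugate()) < 1e-12
                for j in range(2) for k in range(2))           # P^dagger = P
print(keeps_ray, kills_orthogonal, idempotent, hermitian)
```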

The properties of $P$ in (3.36) and (3.37) can be given a geometrical interpretation,
or at least one can construct a geometrical analogy using real numbers instead
of complex numbers. Consider the two-dimensional plane shown in Fig. 3.1, with
vectors labeled using Dirac kets. The line passing through $|\phi\rangle$ is the subspace $\mathcal{P}$.
Let $|\omega\rangle$ be some vector in the plane, and suppose that its projection onto $\mathcal{P}$, along
a direction perpendicular to $\mathcal{P}$, Fig. 3.1(a), falls at the point $\alpha|\phi\rangle$. Then

$$|\omega\rangle = \alpha|\phi\rangle + |\chi\rangle, \tag{3.38}$$

where $|\chi\rangle$ is a vector perpendicular (orthogonal) to $|\phi\rangle$, indicated by the dashed
line. Applying $P$ to both sides of (3.38), using (3.36) and (3.37), one finds that

$$P|\omega\rangle = \alpha|\phi\rangle. \tag{3.39}$$

That is, $P$ on acting on any point $|\omega\rangle$ in the plane projects it onto $\mathcal{P}$ along a line
perpendicular to $\mathcal{P}$, as indicated by the arrow in Fig. 3.1(a). Of course, such a
projection applied to a point already on $\mathcal{P}$ leaves it unchanged, corresponding to
the fact that $P$ acts as the identity operation for elements of this subspace. For
this reason, $P(P(|\omega\rangle))$ is always the same as $P(|\omega\rangle)$, which is equivalent to the
statement that $P^2 = P$. It is also possible to imagine projecting points onto $\mathcal{P}$
along a fixed direction which is not perpendicular to $\mathcal{P}$, as in Fig. 3.1(b). This
defines a linear operator $Q$ which is again a projection operator, since elements of
$\mathcal{P}$ are mapped onto themselves, and thus $Q^2 = Q$. However, this operator is not
Hermitian (in the terminology of real vector spaces, it is not symmetrical), so it is
not an orthogonal (“perpendicular”) projection operator.

Fig. 3.1. Illustrating: (a) an orthogonal (perpendicular) projection onto $\mathcal{P}$; (b) a
nonorthogonal projection represented by $Q$.

Given a projector $P$, we define its complement, written as $\sim\!P$ or $\tilde P$, also called
the negation of $P$ (see Sec. 4.4), to be the projector defined by

$$\tilde P = I - P. \tag{3.40}$$

It is easy to show that $\tilde P$ satisfies the conditions for a projector in (3.34) and that

$$P\tilde P = 0 = \tilde P P. \tag{3.41}$$

From (3.40) it is obvious that $P$ is, in turn, the complement (or negation) of $\tilde P$. Let
$\mathcal{P}$ and $\mathcal{P}^\perp$ be the subspaces of $\mathcal{H}$ onto which $P$ and $\tilde P$ project. Any element $|\omega\rangle$ of
$\mathcal{P}^\perp$ is orthogonal to any element $|\phi\rangle$ of $\mathcal{P}$:

$$\langle\omega|\phi\rangle = (|\omega\rangle)^\dagger|\phi\rangle = (\tilde P|\omega\rangle)^\dagger (P|\phi\rangle) = \langle\omega|\tilde P P|\phi\rangle = 0, \tag{3.42}$$

because $\tilde P P = 0$. Here we have used the fact that $P$ and $\tilde P$ act as identity operators
on their respective subspaces, and the third equality makes use of (3.33). As a
consequence, any element $|\psi\rangle$ of $\mathcal{H}$ can be written as the sum of two orthogonal
kets, one belonging to $\mathcal{P}$ and one to $\mathcal{P}^\perp$:

$$|\psi\rangle = I|\psi\rangle = P|\psi\rangle + \tilde P|\psi\rangle. \tag{3.43}$$

Using (3.43), one can show that $\mathcal{P}^\perp$ is the orthogonal complement of $\mathcal{P}$, the collection
of all elements of $\mathcal{H}$ which are orthogonal to every ket in $\mathcal{P}$. Similarly, $\mathcal{P}$
is the orthogonal complement $(\mathcal{P}^\perp)^\perp$ of $\mathcal{P}^\perp$.
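
The splitting of a ket into a part in a subspace and a part in its orthogonal complement can be made concrete in two dimensions. This is a sketch with an illustrative normalized ket and an arbitrary $|\psi\rangle$; the names `P_tilde`, `in_P`, and `in_P_perp` are choices made here.

```python
import math

def apply(op, ket):
    """Apply a matrix (list of rows) to a ket."""
    return [sum(a * k for a, k in zip(row, ket)) for row in op]

def bracket(phi, psi):
    """<phi|psi> for kets given as lists of complex components."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

# Projector onto the ray through the normalized ket (1, i)/sqrt(2),
# and its complement (3.40), on a two-dimensional space.
s = 1 / math.sqrt(2)
phi = [s + 0j, s * 1j]
P = [[a * b.conjugate() for b in phi] for a in phi]
identity = [[1, 0], [0, 1]]
P_tilde = [[identity[j][k] - P[j][k] for k in range(2)] for j in range(2)]

psi = [2 - 1j, 0.5 + 3j]            # an arbitrary ket
in_P = apply(P, psi)                # the part lying in the subspace
in_P_perp = apply(P_tilde, psi)     # the part in the orthogonal complement

recombined = [a + b for a, b in zip(in_P, in_P_perp)]
split_error = max(abs(a - b) for a, b in zip(recombined, psi))    # (3.43)
overlap = abs(bracket(in_P, in_P_perp))                           # (3.42)
product_norm = max(abs(z) for z in apply(P, apply(P_tilde, psi))) # (3.41)
print(split_error, overlap, product_norm)  # all close to 0
```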


3.5 Orthogonal projectors and orthonormal bases 

Two projectors $P$ and $Q$ are said to be (mutually) orthogonal if

$$PQ = 0. \tag{3.44}$$

By taking the adjoint of this equation, one can show that $QP = 0$, so that the
order of the operators in the product does not matter. An orthogonal collection of
projectors, or a collection of (mutually) orthogonal projectors, is a set of nonzero
projectors $\{P_1, P_2, \ldots\}$ with the property that

$$P_j P_k = 0 \text{ for } j \ne k. \tag{3.45}$$

The zero operator never plays a useful role in such collections, and excluding it
simplifies various definitions.

Using (3.34) one can show that the sum $P + Q$ of two orthogonal projectors $P$
and $Q$ is a projector, and, more generally, the sum

$$R = \sum_j P_j \tag{3.46}$$

of the members of an orthogonal collection of projectors is a projector. When a
projector $R$ is written as a sum of projectors in an orthogonal collection, we shall
say that this collection constitutes a decomposition or refinement of $R$. In particular,
if $R$ is the identity operator $I$, the collection is a decomposition (refinement) of
the identity:

$$I = \sum_j P_j. \tag{3.47}$$

We shall often write down a sum in the form (3.47) and refer to it as a “decomposition
of the identity.” However, it is important to note that the decomposition is
not the sum itself, but rather it is the set of summands, the collection of projectors
which enter the sum. Whenever a projector $R$ can be written as a sum of projectors
in the form (3.46), it is necessarily the case that these projectors form an orthogonal
collection, meaning that (3.45) is satisfied (see the Bibliography). Nonetheless
it does no harm to consider (3.45) as part of the definition of a decomposition of
the identity, or of some other projector.

If two nonzero kets $|\omega\rangle$ and $|\phi\rangle$ are orthogonal, the same is true of the
corresponding projectors $[\omega]$ and $[\phi]$, as is obvious from the definition in (3.35). An
orthogonal collection of kets is a set $\{|\phi_1\rangle, |\phi_2\rangle, \ldots\}$ of nonzero elements of $\mathcal{H}$
such that $\langle\phi_j|\phi_k\rangle = 0$ when j is unequal to k. If in addition the kets in such a
collection are normalized, so that

$$\langle\phi_j|\phi_k\rangle = \delta_{jk}, \tag{3.48}$$

we shall say that it is an orthonormal collection; the word “orthonormal” combines
“orthogonal” and “normalized”. The corresponding projectors $\{[\phi_1], [\phi_2], \ldots\}$
form an orthogonal collection, and

$$[\phi_j]\,|\phi_k\rangle = |\phi_j\rangle\langle\phi_j|\phi_k\rangle = \delta_{jk}\,|\phi_j\rangle. \tag{3.49}$$

Let $\mathcal{R}$ be the subspace of $\mathcal{H}$ consisting of all linear combinations of kets belonging
to an orthonormal collection $\{|\phi_j\rangle\}$, that is, all elements of the form

$$|\psi\rangle = \sum_j \sigma_j\,|\phi_j\rangle, \tag{3.50}$$

where the $\sigma_j$ are complex numbers. Then the projector R onto $\mathcal{R}$ is the sum of the
corresponding dyad projectors:

$$R = \sum_j |\phi_j\rangle\langle\phi_j| = \sum_j [\phi_j]. \tag{3.51}$$

This follows from the fact that, in light of (3.49), R acts as the identity operator on
a sum of the form (3.50), whereas $R|\omega\rangle = 0$ for every $|\omega\rangle$ which is orthogonal to
every $|\phi_j\rangle$ in the collection, and thus to every $|\psi\rangle$ of the form (3.50).

If every element of $\mathcal{H}$ can be written in the form (3.50), the orthonormal collection
is said to form an orthonormal basis of $\mathcal{H}$, and the corresponding decomposition
of the identity is

$$I = \sum_j |\phi_j\rangle\langle\phi_j| = \sum_j [\phi_j]. \tag{3.52}$$

A basis of $\mathcal{H}$ is a collection of linearly independent kets which span $\mathcal{H}$ in the
sense that any element of $\mathcal{H}$ can be written as a linear combination of kets in the
collection. Such a collection need not consist of normalized states, nor do they have
to be mutually orthogonal. However, in this book we shall for the most part use
orthonormal bases, and for this reason the adjective “orthonormal” will sometimes
be omitted when doing so will not cause confusion.
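
The relations (3.45), (3.46) and (3.52) are easy to check numerically for a small example. The following Python sketch does so using only the standard library; the three-dimensional basis `phi` and the helper names `dyad`, `matmul`, `add` and `close` are illustrative assumptions and not notation from the text.

```python
# Dyad projectors built from an orthonormal collection (illustrative sketch).

def dyad(ket):
    """Matrix of |ket><ket| as a list of rows."""
    return [[a * b.conjugate() for b in ket] for a in ket]

def matmul(A, B):
    """Ordinary matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def close(A, B, tol=1e-12):
    """True if two matrices agree entrywise within tol."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

s = 2 ** -0.5
# Hypothetical orthonormal basis of a three-dimensional Hilbert space.
phi = [[s, s * 1j, 0], [s, -s * 1j, 0], [0, 0, 1]]
P = [dyad(k) for k in phi]

zero = [[0] * 3 for _ in range(3)]
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

# (3.45): distinct members of the collection multiply to zero.
orthogonal = all(close(matmul(P[j], P[k]), zero)
                 for j in range(3) for k in range(3) if j != k)

# (3.46): the sum of two orthogonal projectors is again a projector (R R = R).
R = add(P[0], P[1])
r_is_projector = close(matmul(R, R), R)

# (3.52): the full collection is a decomposition of the identity.
resolves_identity = close(add(R, P[2]), identity)
```

Here `R` comes out as the diagonal matrix with entries 1, 1, 0, the projector onto the plane spanned by the first two basis kets.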


3.6 Column vectors, row vectors, and matrices 

Consider a Hilbert space $\mathcal{H}$ of dimension n, and a particular orthonormal basis. To
make the notation a bit less cumbersome, let us label the basis kets as $|j\rangle$ rather
than $|\phi_j\rangle$. Then (3.48) and (3.52) take the forms

$$\langle j|k\rangle = \delta_{jk}, \tag{3.53}$$

$$I = \sum_j |j\rangle\langle j|, \tag{3.54}$$

and any element $|\psi\rangle$ of $\mathcal{H}$ can be written as

$$|\psi\rangle = \sum_j \sigma_j\,|j\rangle. \tag{3.55}$$

By taking the inner product of both sides of (3.55) with $|k\rangle$, one sees that

$$\sigma_k = \langle k|\psi\rangle, \tag{3.56}$$

and therefore (3.55) can be written as

$$|\psi\rangle = \sum_j \langle j|\psi\rangle\,|j\rangle = \sum_j |j\rangle\langle j|\psi\rangle. \tag{3.57}$$

The form on the right side with the scalar coefficient $\langle j|\psi\rangle$ following rather than
preceding the ket $|j\rangle$ provides a convenient way of deriving or remembering this
result since (3.57) is the obvious equality $|\psi\rangle = I|\psi\rangle$ with I replaced with the
dyad expansion in (3.54).

Using the basis $\{|j\rangle\}$, the ket $|\psi\rangle$ can be conveniently represented as a column
vector of the coefficients in (3.57):

$$\begin{pmatrix} \langle 1|\psi\rangle \\ \langle 2|\psi\rangle \\ \vdots \\ \langle n|\psi\rangle \end{pmatrix}. \tag{3.58}$$

Because of (3.57), this column vector uniquely determines the ket $|\psi\rangle$, so as long
as the basis is held fixed there is a one-to-one correspondence between kets and
column vectors. (Of course, if the basis is changed, the same ket will be represented
by a different column vector.) If one applies the dagger operation to both sides of
(3.57), the result is

$$\langle\psi| = \sum_j \langle\psi|j\rangle\langle j|, \tag{3.59}$$

which could also be written down immediately using (3.54) and the fact that
$\langle\psi| = \langle\psi|I$. The numerical coefficients on the right side of (3.59) form a row vector

$$\bigl(\langle\psi|1\rangle,\ \langle\psi|2\rangle,\ \ldots,\ \langle\psi|n\rangle\bigr) \tag{3.60}$$

which uniquely determines $\langle\psi|$, and vice versa. This row vector is obtained by
“transposing” the column vector in (3.58), that is, laying it on its side, and
taking the complex conjugate of each element, which is the vector analog of
$\langle\psi| = (|\psi\rangle)^\dagger$. An inner product can be written as a row vector times a column vector, in
the sense of matrix multiplication:

$$\langle\phi|\psi\rangle = \sum_j \langle\phi|j\rangle\langle j|\psi\rangle. \tag{3.61}$$

This can be thought of as $\langle\phi|\psi\rangle = \langle\phi|I|\psi\rangle$ interpreted with the help of (3.54).
Given an operator A on $\mathcal{H}$, its jk matrix element is

$$A_{jk} = \langle j|A|k\rangle, \tag{3.62}$$

where the usual subscript notation is on the left, and the corresponding Dirac notation,
see (3.17), is on the right. The matrix elements can be arranged to form a
square matrix

$$\begin{pmatrix}
\langle 1|A|1\rangle & \langle 1|A|2\rangle & \cdots & \langle 1|A|n\rangle \\
\langle 2|A|1\rangle & \langle 2|A|2\rangle & \cdots & \langle 2|A|n\rangle \\
\vdots & \vdots & & \vdots \\
\langle n|A|1\rangle & \langle n|A|2\rangle & \cdots & \langle n|A|n\rangle
\end{pmatrix} \tag{3.63}$$

with the first or left index j of $\langle j|A|k\rangle$ labeling the rows, and the second or right
index k labeling the columns. It is sometimes helpful to think of such a matrix
as made up of a collection of n row vectors of the form (3.60), or, alternatively, n
column vectors of the type (3.58). The matrix elements of the adjoint $A^\dagger$ of the
operator A are given by

$$\langle j|A^\dagger|k\rangle = \langle k|A|j\rangle^*, \tag{3.64}$$

which is a particular case of (3.29). Thus the matrix of $A^\dagger$ is the complex conjugate
of the transpose of the matrix of A. If the operator $A = A^\dagger$ is Hermitian, (3.64)
implies that its diagonal matrix elements $\langle j|A|j\rangle$ are real.

Let us suppose that the result of A operating on $|\psi\rangle$ is a ket

$$|\phi\rangle = A|\psi\rangle. \tag{3.65}$$

By multiplying this on the left with the bra $\langle k|$, and writing A as AI with I in the
form (3.54), one obtains

$$\langle k|\phi\rangle = \sum_j \langle k|A|j\rangle\langle j|\psi\rangle. \tag{3.66}$$

That is, the column vector for $|\phi\rangle$ is obtained by multiplying the matrix for A times
the column vector for $|\psi\rangle$, following the usual rule for matrix multiplication. This
shows, incidentally, that the operator A is uniquely determined by its matrix (given
a fixed orthonormal basis), since this matrix determines how A maps any $|\psi\rangle$ of
the Hilbert space onto $A|\psi\rangle$. Another way to see that the matrix determines A is
to write A as a sum of dyads, starting with A = IAI and using (3.54):

$$A = \sum_j \sum_k |j\rangle\langle j|A|k\rangle\langle k| = \sum_j \sum_k \langle j|A|k\rangle\,|j\rangle\langle k|. \tag{3.67}$$

The matrix of the product AB of two operators is the matrix product of the two
matrices, in the same order:

$$\langle j|AB|k\rangle = \sum_i \langle j|A|i\rangle\langle i|B|k\rangle, \tag{3.68}$$

an expression easily derived by writing AB = AIB and invoking the invaluable
(3.54).
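
The correspondence between kets, bras, operators and their column vectors, row vectors and matrices can be tried out directly. The Python sketch below is only an illustration: the 2×2 matrices `A`, `B` and the kets `psi`, `phi` are made-up numbers, and the helper names are assumptions rather than anything defined in the text.

```python
# Kets as columns, bras as conjugated rows, operators as square matrices,
# all relative to one fixed orthonormal basis (illustrative sketch).

def bra(ket):
    """Row vector for <psi|: complex conjugate of each column entry, cf. (3.60)."""
    return [c.conjugate() for c in ket]

def inner(phi, psi):
    """<phi|psi> as a row vector times a column vector, eq. (3.61)."""
    return sum(a * b for a, b in zip(bra(phi), psi))

def apply(A, psi):
    """Column vector of A|psi>, eq. (3.66)."""
    return [sum(a * c for a, c in zip(row, psi)) for row in A]

def matmul(A, B):
    """Matrix of the product AB, eq. (3.68)."""
    n = len(A)
    return [[sum(A[j][i] * B[i][k] for i in range(n)) for k in range(n)]
            for j in range(n)]

def dagger(A):
    """Matrix of the adjoint: complex conjugate of the transpose, eq. (3.64)."""
    n = len(A)
    return [[A[k][j].conjugate() for k in range(n)] for j in range(n)]

# Hypothetical operators and kets in a two-dimensional Hilbert space.
A = [[1, 2 - 1j], [3j, 4]]
B = [[0, 1j], [2, 1]]
psi = [1 + 1j, 2]
phi = [3, -1j]

# Consequence of (3.64): <phi|A psi> equals <A^dagger phi|psi>.
lhs = inner(phi, apply(A, psi))
rhs = inner(apply(dagger(A), phi), psi)

# (3.68): applying the product matrix is the same as applying B, then A.
ab_psi = apply(matmul(A, B), psi)
a_b_psi = apply(A, apply(B, psi))
```

Because all entries are small integers, the two sides of each comparison agree exactly, not merely to rounding error.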


3.7 Diagonalization of Hermitian operators 

Books on linear algebra show that if $A = A^\dagger$ is Hermitian, it is always possible
to find a particular orthonormal basis $\{|\alpha_j\rangle\}$ such that in this basis the matrix of
A is diagonal, that is, $\langle\alpha_j|A|\alpha_k\rangle = 0$ whenever $j \neq k$. The diagonal elements
$a_j = \langle\alpha_j|A|\alpha_j\rangle$ must be real numbers in view of (3.64). By using (3.67) one can
write A in the form

$$A = \sum_j a_j\,|\alpha_j\rangle\langle\alpha_j| = \sum_j a_j\,[\alpha_j], \tag{3.69}$$

a sum of real numbers times projectors drawn from an orthogonal collection. The
ket $|\alpha_j\rangle$ is an eigenvector or eigenket of the operator A with eigenvalue $a_j$:

$$A|\alpha_j\rangle = a_j\,|\alpha_j\rangle. \tag{3.70}$$

An eigenvalue is said to be degenerate if it occurs more than once in (3.69), and its
multiplicity is the number of times it occurs in the sum. An eigenvalue which only
occurs once (multiplicity of 1) is called nondegenerate. The identity operator has
only one eigenvalue, equal to 1, whose multiplicity is the dimension n of the Hilbert
space. A projector has only two distinct eigenvalues: 1 with multiplicity equal to
the dimension m of the subspace onto which it projects, and 0 with multiplicity
$n - m$.

The basis which diagonalizes A is unique only if all its eigenvalues are nondegenerate.
Otherwise this basis is not unique, and it is sometimes more convenient
to rewrite (3.69) in an alternative form in which each eigenvalue appears just once.
The first step is to suppose that, as is always possible, the kets $|\alpha_j\rangle$ have been
indexed in such a fashion that the eigenvalues are a nondecreasing sequence:

$$a_1 \le a_2 \le a_3 \le \cdots. \tag{3.71}$$

The next step is best explained by means of an example. Suppose that n = 5, and
that $a_1 = a_2 < a_3 < a_4 = a_5$. That is, the multiplicity of $a_1$ is 2, that of $a_3$ is 1,
and that of $a_4$ is 2. Then (3.69) can be written in the form

$$A = a_1 P_1 + a_3 P_2 + a_4 P_3, \tag{3.72}$$

where the three projectors

$$P_1 = |\alpha_1\rangle\langle\alpha_1| + |\alpha_2\rangle\langle\alpha_2|, \quad
P_2 = |\alpha_3\rangle\langle\alpha_3|, \quad
P_3 = |\alpha_4\rangle\langle\alpha_4| + |\alpha_5\rangle\langle\alpha_5| \tag{3.73}$$

form a decomposition of the identity. By relabeling the eigenvalues as

$$a'_1 = a_1, \quad a'_2 = a_3, \quad a'_3 = a_4, \tag{3.74}$$

it is possible to rewrite (3.72) in the form

$$A = \sum_j a'_j P_j, \tag{3.75}$$

where no two eigenvalues are the same:

$$a'_j \neq a'_k \quad \text{for } j \neq k. \tag{3.76}$$

Generalizing from this example, we see that it is always possible to write a Hermitian
operator in the form (3.75) with eigenvalues satisfying (3.76). If all the
eigenvalues of A are nondegenerate, each $P_j$ projects onto a ray or pure state, and
(3.75) is just another way to write (3.69).

One advantage of using the expression (3.75), in which the eigenvalues are unequal,
in preference to (3.69), where some of them can be the same, is that the
decomposition of the identity $\{P_j\}$ which enters (3.75) is uniquely determined by
the operator A. On the other hand, if an eigenvalue of A is degenerate, the corresponding
eigenvectors are not unique. In the example in (3.72) one could replace
$|\alpha_1\rangle$ and $|\alpha_2\rangle$ by any two normalized and mutually orthogonal kets $|\alpha'_1\rangle$ and $|\alpha'_2\rangle$
belonging to the two-dimensional subspace onto which $P_1$ projects, and similar
considerations apply to $|\alpha_4\rangle$ and $|\alpha_5\rangle$. We shall call the (unique) decomposition
of the identity $\{P_j\}$ which allows a Hermitian operator A to be written in the form
(3.75) with eigenvalues satisfying (3.76) the decomposition corresponding to or
generated by the operator A.

If {A, B, C, . . . } is a collection of Hermitian operators which commute with
each other, (3.23), they can be simultaneously diagonalized in the sense that there
is a single orthonormal basis $\{|\phi_j\rangle\}$ such that

$$A = \sum_j a_j\,[\phi_j], \quad B = \sum_j b_j\,[\phi_j], \quad C = \sum_j c_j\,[\phi_j], \tag{3.77}$$

and so forth. If instead one writes the operators in terms of the decompositions
which they generate, as in (3.75),

$$A = \sum_j a'_j P_j, \quad B = \sum_k b'_k Q_k, \quad C = \sum_l c'_l R_l, \tag{3.78}$$

and so forth, the projectors in each decomposition commute with the projectors in
the other decompositions: $P_j Q_k = Q_k P_j$, etc.
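
The uniqueness statement can be illustrated numerically: rotating the eigenvectors inside a degenerate subspace changes the individual dyads but not the projector onto that subspace, and hence not the operator. The sketch below is a minimal Python example under assumed values; the eigenvalues 2, 2, 5, the rotation angle `t` and the helper names are all hypothetical.

```python
import math

def dyad(ket):
    """Matrix of |ket><ket|."""
    return [[a * b.conjugate() for b in ket] for a in ket]

def lincomb(coeffs, mats):
    """Entrywise linear combination sum_k coeffs[k] * mats[k]."""
    n = len(mats[0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]

def close(A, B, tol=1e-12):
    """True if two matrices agree entrywise within tol."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Hypothetical Hermitian operator in the form (3.69): eigenvalue 2 is doubly
# degenerate, eigenvalue 5 is nondegenerate.
alpha = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
a = [2, 2, 5]
A = lincomb(a, [dyad(k) for k in alpha])

# A different orthonormal pair spanning the same degenerate subspace.
t = 0.3
alpha_prime = [[math.cos(t), math.sin(t), 0], [-math.sin(t), math.cos(t), 0]]

P1 = lincomb([1, 1], [dyad(alpha[0]), dyad(alpha[1])])
P1_prime = lincomb([1, 1], [dyad(k) for k in alpha_prime])
P2 = dyad(alpha[2])

# The projector onto the degenerate subspace does not depend on the pair chosen.
same_projector = close(P1, P1_prime)

# Form (3.75): each distinct eigenvalue appears once, multiplying its projector.
A_from_decomposition = lincomb([2, 5], [P1_prime, P2])
same_operator = close(A, A_from_decomposition)
```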


3.8 Trace 

The trace of an operator A is the sum of its diagonal matrix elements:

$$\mathrm{Tr}(A) = \sum_j \langle j|A|j\rangle. \tag{3.79}$$

While the individual diagonal matrix elements depend upon the orthonormal basis,
their sum, and thus the trace, is independent of basis and depends only on the
operator A. The trace is a linear operation in that if A and B are operators, and $\alpha$
and $\beta$ are scalars,

$$\mathrm{Tr}(\alpha A + \beta B) = \alpha\,\mathrm{Tr}(A) + \beta\,\mathrm{Tr}(B). \tag{3.80}$$

The trace of a dyad is the corresponding inner product,

$$\mathrm{Tr}\bigl(|\phi\rangle\langle\tau|\bigr) = \sum_j \langle j|\phi\rangle\langle\tau|j\rangle = \langle\tau|\phi\rangle, \tag{3.81}$$

as is clear from (3.61).

The trace of the product of two operators A and B is independent of the order of
the product,

$$\mathrm{Tr}(AB) = \mathrm{Tr}(BA), \tag{3.82}$$

and the trace of the product of three or more operators is not changed if one makes
a cyclic permutation of the factors:

$$\begin{aligned}
\mathrm{Tr}(ABC) &= \mathrm{Tr}(BCA) = \mathrm{Tr}(CAB), \\
\mathrm{Tr}(ABCD) &= \mathrm{Tr}(BCDA) = \mathrm{Tr}(CDAB) = \mathrm{Tr}(DABC),
\end{aligned} \tag{3.83}$$

and so forth; the cycling is done by moving the operator from the first position in
the product to the last, or vice versa. By contrast, Tr(ACB) is, in general, not the
same as Tr(ABC), for ACB is obtained from ABC by interchanging the second
and third factor, and this is not a cyclic permutation.

The complex conjugate of the trace of an operator is equal to the trace of its
adjoint, as is evident from (3.64), and a similar rule applies to products of operators,
where one should remember to reverse the order, see (3.32):

$$\begin{aligned}
(\mathrm{Tr}(A))^* &= \mathrm{Tr}(A^\dagger), \\
(\mathrm{Tr}(ABC))^* &= \mathrm{Tr}(C^\dagger B^\dagger A^\dagger),
\end{aligned} \tag{3.84}$$

etc.; additional identities can be obtained using cyclic permutations, as in (3.83).

If $A = A^\dagger$ is Hermitian, one can calculate the trace in the basis in which A is
diagonal, with the result

$$\mathrm{Tr}(A) = \sum_{j=1}^{n} a_j. \tag{3.85}$$

That is, the trace is equal to the sum of the eigenvalues appearing in (3.69). In
particular, the trace of a projector P is the dimension of the subspace onto which it
projects.
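
A small numerical check makes the distinction between cyclic and non-cyclic permutations concrete. In the Python sketch below the matrices `A`, `B`, `C` and the kets `phi`, `tau` are arbitrary made-up examples, and the helper names are assumptions for illustration only.

```python
def matmul(A, B):
    """Ordinary matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    """Sum of the diagonal matrix elements, eq. (3.79)."""
    return sum(A[j][j] for j in range(len(A)))

def outer(phi, tau):
    """Matrix of the dyad |phi><tau|."""
    return [[a * b.conjugate() for b in tau] for a in phi]

def inner(tau, phi):
    """Inner product <tau|phi>."""
    return sum(a.conjugate() * b for a, b in zip(tau, phi))

# Hypothetical operators that do not commute with one another.
A = [[1, 2], [0, 1]]
B = [[0, 1], [1, 0]]
C = [[1, 0], [0, 2]]

tr_abc = trace(matmul(matmul(A, B), C))
tr_bca = trace(matmul(matmul(B, C), A))
tr_cab = trace(matmul(matmul(C, A), B))
tr_acb = trace(matmul(matmul(A, C), B))   # not a cyclic permutation of ABC

# (3.81): the trace of a dyad is the corresponding inner product.
phi = [1, 2j]
tau = [3, 1 - 1j]
tr_dyad = trace(outer(phi, tau))
```

For these particular matrices the three cyclic orderings all give trace 2, whereas `tr_acb` is 4.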


3.9 Positive operators and density matrices 

A Hermitian operator A is said to be a positive operator provided

$$\langle\psi|A|\psi\rangle \ge 0 \tag{3.86}$$

holds for every $|\psi\rangle$ in the Hilbert space or, equivalently, if all its eigenvalues are
nonnegative:

$$a_j \ge 0 \quad \text{for all } j. \tag{3.87}$$

While (3.87) is easily shown to imply (3.86), and vice versa, memorizing both
definitions is worthwhile, as sometimes one is more useful, and sometimes the
other.

If A is a positive operator and a a positive real number, then aA is a positive
operator. Also the sum of any collection of positive operators is a positive operator;
this is an obvious consequence of (3.86). The support of a positive operator A is
defined to be the projector $A_s$, or the subspace onto which it projects, given by the
sum of those $[\alpha_j]$ in (3.69) with $a_j > 0$, or of the $P_j$ in (3.75) with $a'_j > 0$. It
follows from the definition that

$$A_s A = A. \tag{3.88}$$

An alternative definition is that the support of A is the smallest projector $A_s$, in the
sense of minimizing $\mathrm{Tr}(A_s)$, which satisfies (3.88).

The trace of a positive operator is obviously a nonnegative number, see (3.85) 
and (3.87), and is strictly positive unless the operator is the zero operator with all 
zero eigenvalues. A positive operator A which is not the zero operator can always 
be normalized by defining a new operator 

$$\bar{A} = A/\mathrm{Tr}(A) \tag{3.89}$$

whose trace is equal to 1. In quantum physics a positive operator with trace equal 
to 1 is called a density matrix. The terminology is unfortunate, because a density 
matrix is an operator, not a matrix, and the matrix for this operator depends on 
the choice of orthonormal basis. However, by now the term is firmly embedded 
in the literature, and attempts to replace it with something more rational, such as 
“statistical operator”, have not been successful. 

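If C is any operator, then $C^\dagger C$ is a positive operator, since for any $|\psi\rangle$,

$$\langle\psi|C^\dagger C|\psi\rangle = \langle\phi|\phi\rangle \ge 0, \tag{3.90}$$

where $|\phi\rangle = C|\psi\rangle$. Consequently, $\mathrm{Tr}(C^\dagger C)$ is nonnegative. If $\mathrm{Tr}(C^\dagger C) = 0$, then
$C^\dagger C = 0$, and $\langle\psi|C^\dagger C|\psi\rangle$ vanishes for every $|\psi\rangle$, which means that $C|\psi\rangle$ is zero
for every $|\psi\rangle$, and therefore C = 0. Thus for any operator C it is the case that

$$\mathrm{Tr}(C^\dagger C) \ge 0, \tag{3.91}$$

with equality if and only if C = 0.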

The product AB of two positive operators A and B is, in general, not Hermitian,
and therefore not a positive operator. However, if A and B commute, AB is positive,
as can be seen from the fact that there is an orthonormal basis in which the
matrices of both A and B, and therefore also AB, are diagonal. This result generalizes
to the product of any collection of commuting positive operators. Whether
or not A and B commute, the fact that they are both positive means that Tr(AB) is
a real, nonnegative number,

$$\mathrm{Tr}(AB) = \mathrm{Tr}(BA) \ge 0, \tag{3.92}$$

equal to 0 if and only if AB = BA = 0. This result does not generalize to a
product of three or more operators: if A, B, and C are positive operators that do not
commute with each other, there is in general nothing one can say about Tr(ABC).

To derive (3.92) it is convenient to first define the square root $A^{1/2}$ of a positive
operator A by means of the formula

$$A^{1/2} = \sum_j \sqrt{a_j}\,[\alpha_j], \tag{3.93}$$

where $\sqrt{a_j}$ is the positive square root of the eigenvalue $a_j$ in (3.69). Then when A
and B are both positive, one can write

$$\begin{aligned}
\mathrm{Tr}(AB) &= \mathrm{Tr}(A^{1/2} A^{1/2} B^{1/2} B^{1/2}) \\
&= \mathrm{Tr}(A^{1/2} B^{1/2} B^{1/2} A^{1/2}) = \mathrm{Tr}(C^\dagger C) \ge 0,
\end{aligned} \tag{3.94}$$

where $C = B^{1/2} A^{1/2}$. If Tr(AB) = 0, then, see (3.91), $C = 0 = C^\dagger$, and both
$BA = B^{1/2} C A^{1/2}$ and $AB = A^{1/2} C^\dagger B^{1/2}$ vanish.
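
These statements can be checked on a simple case. In the Python sketch below the two positive operators are taken to be noncommuting projectors, and `W` and `C` are further made-up matrices; all names and numbers are illustrative assumptions.

```python
def matmul(A, B):
    """Ordinary matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Complex conjugate of the transpose."""
    n = len(A)
    return [[A[k][j].conjugate() for k in range(n)] for j in range(n)]

def trace(A):
    """Sum of the diagonal matrix elements."""
    return sum(A[j][j] for j in range(len(A)))

# Two positive operators that do not commute (both happen to be projectors).
A = [[1, 0], [0, 0]]            # dyad of the first basis ket
B = [[0.5, 0.5], [0.5, 0.5]]    # dyad of the normalized sum of the basis kets

AB = matmul(A, B)
ab_is_hermitian = AB == dagger(AB)   # the product is not Hermitian here
tr_ab = trace(AB)                    # yet its trace is real and nonnegative

# Normalizing a positive operator to a density matrix, eq. (3.89).
W = [[2, 1j], [-1j, 2]]              # Hermitian with eigenvalues 1 and 3
rho = [[w / trace(W) for w in row] for row in W]

# Tr(C^dagger C) is the sum of |C_jk|^2, hence nonnegative, eq. (3.91).
C = [[1, 2j], [0, 1]]
tr_cdc = trace(matmul(dagger(C), C))
```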


3.10 Functions of operators 

Suppose that f(x) is an ordinary numerical function, such as $x^2$ or $e^x$. It is sometimes
convenient to define a corresponding function f(A) of an operator A, so that
the value of the function is also an operator. When f(x) is a polynomial

$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_p x^p, \tag{3.95}$$

one can write

$$f(A) = a_0 I + a_1 A + a_2 A^2 + \cdots + a_p A^p, \tag{3.96}$$

since the square, cube, etc. of any operator is itself an operator. When f(x) is
represented by a power series, as in $\log(1 + x) = x - x^2/2 + \cdots$, the same
procedure will work provided the series in powers of the operator A converges, but
this is often not a trivial issue.

An alternative approach is possible in the case of operators which can be diagonalized
in some orthonormal basis. Thus if A is written in the form (3.69), one can
define f(A) to be the operator

$$f(A) = \sum_j f(a_j)\,[\alpha_j], \tag{3.97}$$

where $f(a_j)$ on the right side is the value of the numerical function. This agrees
with the previous definition in the case of polynomials, but allows the use of much
more general functions. As an example, the square root $A^{1/2}$ of a positive operator
A as defined in (3.93) is obtained by setting $f(x) = \sqrt{x}$ for $x \ge 0$ in (3.97).
Note that in order to use (3.97), the numerical function f(x) must be defined for
any x which is an eigenvalue of A. For a Hermitian operator these eigenvalues
are real, but in other cases, such as the unitary operators discussed in Sec. 7.2, the
eigenvalues may be complex, so for such operators f(x) will need to be defined
for suitable complex values of x.
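
The agreement between the two definitions can be seen in a short computation. The Python sketch below builds a positive operator from an assumed eigenbasis and eigenvalues (1 and 4, chosen for convenience), forms its square root through (3.97), and compares the polynomial $x^2 + 1$ evaluated both ways; the helper names are hypothetical.

```python
import math

def dyad(ket):
    """Matrix of |ket><ket|."""
    return [[a * b.conjugate() for b in ket] for a in ket]

def lincomb(coeffs, mats):
    """Entrywise linear combination sum_k coeffs[k] * mats[k]."""
    n = len(mats[0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Ordinary matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(A, B, tol=1e-12):
    """True if two matrices agree entrywise within tol."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

s = 2 ** -0.5
alpha = [[s, s], [s, -s]]     # assumed orthonormal eigenbasis
a = [1.0, 4.0]                # assumed eigenvalues
proj = [dyad(k) for k in alpha]
A = lincomb(a, proj)          # form (3.69)

def func_of_operator(f):
    """f(A) defined through the eigenvalues, eq. (3.97)."""
    return lincomb([f(x) for x in a], proj)

# Square root as in (3.93); squaring it should give back A.
sqrt_A = func_of_operator(math.sqrt)
squares_back = close(matmul(sqrt_A, sqrt_A), A)

# Polynomial f(x) = x^2 + 1: direct form (3.96) versus spectral form (3.97).
identity = [[1, 0], [0, 1]]
poly_direct = lincomb([1, 1], [matmul(A, A), identity])
poly_spectral = func_of_operator(lambda x: x * x + 1)
definitions_agree = close(poly_direct, poly_spectral)
```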

Physical properties


4.1 Classical and quantum properties 

We shall use the term physical property to refer to something which can be said to 
be either true or false for a particular physical system. Thus “the energy is between
10 and 12” or “the particle is between $x_1$ and $x_2$” are examples of physical properties.
One must distinguish between a physical property and a physical variable,
such as the position or energy or momentum of a particle. A physical variable can 
take on different numerical values, depending upon the state of the system, whereas 
a physical property is either a true or a false description of a particular physical sys- 
tem at a particular time. A physical variable taking on a particular value, or lying 
in some range of values, is an example of a physical property. 

In the case of a classical mechanical system, a physical property is always associated
with some subset of points in its phase space. Consider, for example,
a harmonic oscillator whose phase space is the x, p plane shown in Fig. 2.1 on
page 12. The property that its energy is equal to some value $E_0 > 0$ is associated
with a set of points on an ellipse centered at the origin. The property that the energy
is less than $E_0$ is associated with the set of points inside this ellipse. The property
that the position x lies between $x_1$ and $x_2$ corresponds to the set of points inside a
vertical band shown cross-hatched in this figure, and so forth.

Given a property P associated with a set of points $\mathcal{P}$ in the phase space, it is
convenient to introduce an indicator function, or indicator for short, $P(\gamma)$, where
$\gamma$ is a point in the phase space, defined in the following way:

$$P(\gamma) = \begin{cases} 1 & \text{if } \gamma \in \mathcal{P}, \\ 0 & \text{otherwise.} \end{cases} \tag{4.1}$$

(It is convenient to use the same symbol P for a property and for its indicator, as
this will cause no confusion.) Thus if at some instant of time the phase point $\gamma_0$
associated with a particular physical system is inside the set $\mathcal{P}$, then $P(\gamma_0) = 1$,
meaning that the system possesses this property, or the property is true. Similarly,
if $P(\gamma_0) = 0$, the system does not possess this property, so for this system the
property is false.
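
As a concrete illustration, indicator functions for two of the harmonic oscillator properties mentioned above can be written down in a few lines of Python. The sketch assumes unit mass and unit angular frequency, so that the energy is $(p^2 + x^2)/2$; these units and the function names are assumptions made for the example.

```python
def energy(x, p, m=1.0, omega=1.0):
    """Classical oscillator energy; m and omega default to assumed unit values."""
    return p * p / (2 * m) + 0.5 * m * omega ** 2 * x * x

def indicator_energy_below(e0):
    """Indicator of the property 'the energy is less than e0'."""
    def P(gamma):
        x, p = gamma
        return 1 if energy(x, p) < e0 else 0
    return P

def indicator_position_between(x1, x2):
    """Indicator of the property 'x lies between x1 and x2'."""
    def P(gamma):
        x, _p = gamma
        return 1 if x1 <= x <= x2 else 0
    return P

P_energy = indicator_energy_below(1.0)
P_position = indicator_position_between(0.0, 1.0)

inside = P_energy((0.5, 0.5))     # energy 0.25: the property is true
outside = P_energy((2.0, 0.0))    # energy 2.0: the property is false
in_band = P_position((0.5, 3.0))  # the momentum is irrelevant for this property
```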

A physical property of a quantum system is associated with a subspace $\mathcal{P}$ of the
quantum Hilbert space $\mathcal{H}$ in much the same way as a physical property of a classical
system is associated with a subset of points in its phase space, and the projector P
onto $\mathcal{P}$, Sec. 3.4, plays a role analogous to the classical indicator function. If the
quantum system is described by a ket $|\psi\rangle$ which lies in the subspace $\mathcal{P}$, so that $|\psi\rangle$
is an eigenstate of P with eigenvalue 1,

$$P|\psi\rangle = |\psi\rangle, \tag{4.2}$$

one can say that the quantum system has the property P. On the other hand, if $|\psi\rangle$
is an eigenstate of P with eigenvalue 0,

$$P|\psi\rangle = 0, \tag{4.3}$$

then the quantum system does not have the property P, or, equivalently, it has
the property $\tilde{P}$ which is the negation of P, see Sec. 4.4. When $|\psi\rangle$ is not an
eigenstate of P, a situation with no analog in classical mechanics, we shall say that
the property P is undefined for the quantum system.
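
The three cases just described (true, false, undefined) can be turned into a small test. The Python sketch below is a minimal illustration for a two-dimensional Hilbert space; the function name `classify`, the tolerance and the example kets are assumptions, and the ket passed in is taken to be nonzero.

```python
def apply(P, psi):
    """Column vector of P|psi>."""
    return [sum(a * c for a, c in zip(row, psi)) for row in P]

def classify(P, psi, tol=1e-12):
    """Status of the property with projector P for the nonzero ket psi."""
    P_psi = apply(P, psi)
    if all(abs(a - b) < tol for a, b in zip(P_psi, psi)):
        return "true"        # eigenvalue 1, eq. (4.2)
    if all(abs(a) < tol for a in P_psi):
        return "false"       # eigenvalue 0, eq. (4.3)
    return "undefined"       # psi is not an eigenstate of P

# Projector onto the first basis ket of a two-dimensional space.
P = [[1, 0], [0, 0]]

status_in = classify(P, [1, 0])       # lies in the subspace
status_out = classify(P, [0, 2j])     # orthogonal to the subspace
status_mixed = classify(P, [1, 1])    # a superposition of the two
```

Since a projector has only the eigenvalues 0 and 1, a nonzero ket that passes neither of the first two tests cannot be an eigenstate, which is why the last branch needs no further check.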


4.2 Toy model and spin half 

In this section we will consider various physical properties associated with a toy
model and with a spin-half particle, and in Sec. 4.3 properties of a continuous
quantum system, such as a particle with a wave function $\psi(x)$. In Sec. 2.5 we
introduced a toy model with wave function $\psi(m)$, where the position variable m is
an integer restricted to taking on one of the $M = M_a + M_b + 1$ values in the range

$$-M_a \le m \le M_b. \tag{4.4}$$

In (2.26) we defined a wave function $\chi_n(m) = \delta_{mn}$ whose physical significance is
that the particle is at the site (or in the cell) n. Let $|n\rangle$ be the corresponding Dirac
ket. Then (2.24) tells us that

$$\langle k|n\rangle = \delta_{kn}, \tag{4.5}$$

so the kets $\{|n\rangle\}$ form an orthonormal basis of the Hilbert space.

Any scalar multiple $\alpha|n\rangle$ of $|n\rangle$, where $\alpha$ is a nonzero complex number, has precisely
the same physical significance as $|n\rangle$. The set of all kets of this form together
with the zero ket, that is, the set of all multiples of $|n\rangle$, form a one-dimensional subspace
of $\mathcal{H}$, and the projector onto this subspace, see Sec. 3.4, is

$$[n] = |n\rangle\langle n|. \tag{4.6}$$

Thus it is natural to associate the property that the particle is at the site n (something
which can be true or false) with this subspace, or, equivalently, with the corresponding
projector, since there is a one-to-one correspondence between subspaces
and projectors.

Since the projectors [0] and [1] for sites 0 and 1 are orthogonal to each other,
their sum is also a projector

$$R = [0] + [1]. \tag{4.7}$$

The subspace $\mathcal{R}$ onto which R projects is two-dimensional and consists of all linear
combinations of $|0\rangle$ and $|1\rangle$, that is, all states of the form

$$|\phi\rangle = \alpha|0\rangle + \beta|1\rangle. \tag{4.8}$$

Equivalently, it corresponds to all wave functions $\psi(m)$ which vanish when m is
unequal to 0 or 1. The physical significance of $\mathcal{R}$, see the discussion in Sec. 2.3,
is that the toy particle is not outside the interval [0, 1], where, since we are using a
discrete model, the interval [0, 1] consists of the two sites m = 0 and m = 1. One
can interpret “not outside” as meaning “inside”, provided that is not understood to
mean “at one or the other of the two sites m = 0 or m = 1.”

The reason one needs to be cautious is that a typical state in $\mathcal{R}$ will be of the
form (4.8) with both $\alpha$ and $\beta$ unequal to zero. Such a state does not have the
property that it is at m = 0, for all states with this property are scalar multiples of
$|0\rangle$, and $|\phi\rangle$ is not of this form. Indeed, $|\phi\rangle$ is not an eigenstate of the projector [0]
representing the property m = 0, and hence according to the definition given at the
end of Sec. 4.1, the property m = 0 is undefined. The same comments apply to
the property m = 1. Thus it is certainly incorrect to say that the particle is either
at 0 or at 1. Instead, the particle is represented by a delocalized wave, as discussed
in Sec. 2.3. There are some states in $\mathcal{R}$ which are localized at 0 or localized at
1, but since $\mathcal{R}$ also contains other, delocalized, states, the property corresponding
to $\mathcal{R}$ or its projector R, which holds for all states in this subspace, needs to be
expressed by some English phrase other than “at 0 or 1”. The phrases “not outside
the interval [0, 1]” or “no place other than 0 or 1,” while they are a bit awkward,
come closer to saying what one wants to say. The way to be perfectly precise is to
use the projector R itself, since it is a precisely defined mathematical quantity. But
of course one needs to build up an intuitive picture of what it is that R means.

The process of building up one’s intuition about the meaning of R will be aided
by noting that (4.7) is not the only way of writing it as a sum of two orthogonal
projectors. Another possibility is

$$R = [\sigma] + [\tau], \tag{4.9}$$

where

$$|\sigma\rangle = \bigl(|0\rangle + i|1\rangle\bigr)/\sqrt{2}, \quad |\tau\rangle = \bigl(|0\rangle - i|1\rangle\bigr)/\sqrt{2} \tag{4.10}$$

are two normalized states in $\mathcal{R}$ which are mutually orthogonal. To check that (4.9)
is correct, one can work out the dyad

$$\begin{aligned}
|\sigma\rangle\langle\sigma| &= \tfrac{1}{2}\bigl(|0\rangle + i|1\rangle\bigr)\bigl(\langle 0| - i\langle 1|\bigr) \\
&= \tfrac{1}{2}\bigl(|0\rangle\langle 0| + |1\rangle\langle 1| + i|1\rangle\langle 0| - i|0\rangle\langle 1|\bigr),
\end{aligned} \tag{4.11}$$

where $\langle\sigma| = (|\sigma\rangle)^\dagger$ has been formed using the rules for the dagger operation (note
the complex conjugate) in (3.12), and (4.11) shows how one can conveniently multiply
things out to find the resulting projector. The dyad $|\tau\rangle\langle\tau|$ is the same except
for the signs of the imaginary terms, so adding this to $|\sigma\rangle\langle\sigma|$ gives R. There are
many other ways besides (4.9) to write R as a sum of two orthogonal projectors.
In fact, given any normalized state in $\mathcal{R}$, one can always find another normalized
state orthogonal to it, and the sum of the dyads corresponding to these two states
is equal to R. The fact that R can be written as a sum P + Q of two orthogonal
projectors P and Q in many different ways is one reason to be cautious in assigning
R a physical interpretation of “property P or property Q”, although there are
occasions, as we shall see later, when such an interpretation is appropriate.
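
The equality of the two decompositions (4.7) and (4.9) can be confirmed with a few lines of Python. The sketch assumes a toy model with only three sites, m = −1, 0, 1, so that R is not the identity; this choice, the sample state `phi` and the helper names are assumptions made for the illustration.

```python
def dyad(ket):
    """Matrix of |ket><ket|."""
    return [[a * b.conjugate() for b in ket] for a in ket]

def add(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def apply(A, psi):
    """Column vector of A|psi>."""
    return [sum(a * c for a, c in zip(row, psi)) for row in A]

def close(A, B, tol=1e-12):
    """True if two matrices agree entrywise within tol."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Components are ordered as m = -1, 0, 1 (assumed three-site toy model).
s = 2 ** -0.5
ket0 = [0, 1, 0]
ket1 = [0, 0, 1]
sigma = [0, s, s * 1j]     # (|0> + i|1>)/sqrt(2), eq. (4.10)
tau = [0, s, -s * 1j]      # (|0> - i|1>)/sqrt(2)

R_sites = add(dyad(ket0), dyad(ket1))    # eq. (4.7)
R_alt = add(dyad(sigma), dyad(tau))      # eq. (4.9)
same_R = close(R_sites, R_alt)

# A delocalized state in the subspace: R leaves it unchanged, while [0] maps
# it to a ket that is neither phi itself nor zero.
phi = [0, 0.6, 0.8]
R_phi = apply(R_sites, phi)
P0_phi = apply(dyad(ket0), phi)
```

The last two lines show the situation discussed above: `phi` has the property R, while the property m = 0 is undefined for it.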

The simplest nontrivial toy model has only M = 2 sites, and it is convenient to
discuss it using language appropriate to the spin degree of freedom of a particle of
spin 1/2. Its Hilbert space $\mathcal{H}$ consists of all linear combinations of two mutually
orthogonal and normalized states which will be denoted by $|z^+\rangle$ and $|z^-\rangle$, and
which one can think of as the counterparts of $|0\rangle$ and $|1\rangle$ in the toy model. (In
the literature they are often denoted by $|+\rangle$ and $|-\rangle$, or by $|\uparrow\rangle$ and $|\downarrow\rangle$.) The
normalization and orthogonality conditions are

$$\langle z^+|z^+\rangle = 1 = \langle z^-|z^-\rangle, \quad \langle z^+|z^-\rangle = 0. \tag{4.12}$$

The physical significance of $|z^+\rangle$ is that the z-component $S_z$ of the internal or “spin”
angular momentum has a value of +1/2 in units of $\hbar$, while $|z^-\rangle$ means that
$S_z = -1/2$ in the same units. One sometimes refers to $|z^+\rangle$ and $|z^-\rangle$ as “spin up” and
“spin down” states.

The two-dimensional Hilbert space $\mathcal{H}$ consists of all linear combinations of the
form

$$\alpha|z^+\rangle + \beta|z^-\rangle, \tag{4.13}$$

where $\alpha$ and $\beta$ are any complex numbers. It is convenient to parameterize these
states in the following way. Let w denote a direction in space corresponding to
$\vartheta, \varphi$ in spherical polar coordinates. For example, $\vartheta = 0$ is the +z direction, while
$\vartheta = \pi/2$ and $\varphi = \pi$ is the −x direction. Then define the two states

$$\begin{aligned}
|w^+\rangle &= +\cos(\vartheta/2)\,e^{-i\varphi/2}\,|z^+\rangle + \sin(\vartheta/2)\,e^{i\varphi/2}\,|z^-\rangle, \\
|w^-\rangle &= -\sin(\vartheta/2)\,e^{-i\varphi/2}\,|z^+\rangle + \cos(\vartheta/2)\,e^{i\varphi/2}\,|z^-\rangle.
\end{aligned} \tag{4.14}$$

These are normalized and mutually orthogonal,

$$\langle w^+|w^+\rangle = 1 = \langle w^-|w^-\rangle, \quad \langle w^+|w^-\rangle = 0, \tag{4.15}$$

as a consequence of (4.12).

The physical significance of $|w^+\rangle$ is that $S_w$, the component of spin angular
momentum in the w direction, has a value of 1/2, whereas for $|w^-\rangle$, $S_w$ has the
value −1/2. For $\vartheta = 0$, $|w^+\rangle$ and $|w^-\rangle$ are the same as $|z^+\rangle$ and $|z^-\rangle$, respectively,
apart from phase factors, $e^{\mp i\varphi/2}$, which do not change their physical significance.
For $\vartheta = \pi$, $|w^+\rangle$ and $|w^-\rangle$ are the same as $|z^-\rangle$ and $|z^+\rangle$, respectively, apart from
phase factors. Suppose that w is a direction which is neither along nor opposite to
the z axis, for example, w = x. Then both $\alpha$ and $\beta$ in (4.13) are nonzero, and $|w^+\rangle$
does not have the property $S_z = +1/2$, nor does it have the property $S_z = -1/2$.
The same is true if $S_z$ is replaced by $S_v$, where v is any direction which is not the
same as w or opposite to w. The situation is analogous to that discussed earlier for
the toy model: think of $|z^+\rangle$ and $|z^-\rangle$ as corresponding to the states $|m = 0\rangle$ and
$|m = 1\rangle$.

Any nonzero wave function (4.13) can be written as a complex number times
$|w^+\rangle$ for a suitable choice of the direction w. For $\beta = 0$, the choice w = z is
obvious, while for $\alpha = 0$ it is w = −z. For other cases, write (4.13) in the form

$$\beta\bigl[(\alpha/\beta)\,|z^+\rangle + |z^-\rangle\bigr]. \tag{4.16}$$

A comparison with the expression for $|w^+\rangle$ in (4.14) shows that $\vartheta$ and $\varphi$ are determined
by the equation

$$e^{-i\varphi}\cot(\vartheta/2) = \alpha/\beta, \tag{4.17}$$

which, since $\alpha/\beta$ is finite (neither 0 nor $\infty$), always has a unique solution in the
range

$$0 \le \varphi < 2\pi, \quad 0 < \vartheta < \pi. \tag{4.18}$$
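
Equation (4.17) is easily solved in practice: the modulus of $\alpha/\beta$ fixes $\vartheta$ and its phase fixes $\varphi$. The following Python sketch, using only the standard `math` and `cmath` modules, carries this out for the generic case in which both coefficients are nonzero; the function names are hypothetical.

```python
import cmath
import math

def direction_from_amplitudes(alpha, beta):
    """Solve e^{-i phi} cot(theta/2) = alpha/beta, eq. (4.17).

    Assumes both alpha and beta are nonzero (the generic case).
    """
    ratio = alpha / beta
    theta = 2 * math.atan2(1.0, abs(ratio))        # cot(theta/2) = |alpha/beta|
    phi = (-cmath.phase(ratio)) % (2 * math.pi)    # e^{-i phi} = phase of ratio
    return theta, phi

def w_plus(theta, phi):
    """Coefficients of |z+> and |z-> in |w+>, eq. (4.14)."""
    return (math.cos(theta / 2) * cmath.exp(-1j * phi / 2),
            math.sin(theta / 2) * cmath.exp(1j * phi / 2))

# Equal amplitudes correspond to the +x direction: theta = pi/2, phi = 0.
theta_x, phi_x = direction_from_amplitudes(1, 1)

# A generic example: the recovered |w+> should be proportional to (alpha, beta),
# which is tested by the vanishing of alpha*w[1] - beta*w[0].
alpha, beta = 1 + 2j, 3 - 1j
w = w_plus(*direction_from_amplitudes(alpha, beta))
mismatch = abs(alpha * w[1] - beta * w[0])
```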


4.3 Continuous quantum systems 

This section deals with the quantum properties of a particle in one dimension described
by a wave function $\psi(x)$ depending on the continuous variable x. Similar
considerations apply to a particle in three dimensions with a wave function
$\psi(\mathbf{r})$, and the same general approach can be extended to apply to collections of
several particles. Quantum properties are again associated with subspaces of the
Hilbert space $\mathcal{H}$, and since $\mathcal{H}$ is infinite-dimensional, these subspaces can be either
finite- or infinite-dimensional; we shall consider examples of both. (For infinite-dimensional
subspaces one adds the technical requirement that they be closed, as
defined in books on functional analysis.)

As a first example, consider the property that a particle lies inside (which is to
say not outside) the interval

$$x_1 \le x \le x_2, \tag{4.19}$$

with $x_1 < x_2$. As pointed out in Sec. 2.3, the (infinite-dimensional) subspace $\mathcal{X}$
which corresponds to this property consists of all wave functions which vanish
for x outside the interval (4.19). The projector X associated with $\mathcal{X}$ is defined as
follows. Acting on some wave function $\psi(x)$, X produces a new function $\psi_X(x)$
which is identical to $\psi(x)$ for x inside the interval (4.19), and zero outside this
interval:

$$\psi_X(x) = X\psi(x) = \begin{cases} \psi(x) & \text{for } x_1 \le x \le x_2, \\ 0 & \text{for } x < x_1 \text{ or } x > x_2. \end{cases} \tag{4.20}$$

Note that X leaves unchanged any function which belongs to the subspace $\mathcal{X}$, so
it acts as the identity operator on this subspace. If a wave function $\omega(x)$ vanishes
throughout the interval (4.19), it will be orthogonal to all the functions in $\mathcal{X}$, and
X applied to $\omega(x)$ will yield a function which is everywhere equal to 0. Thus X
has the properties one would expect for a projector as discussed in Sec. 3.4.

One can write (4.20) using Dirac notation as

$$|\psi_X\rangle = X|\psi\rangle, \tag{4.21}$$

where the element $|\psi\rangle$ of the Hilbert space can be represented either as a position
wave function $\psi(x)$ or as a momentum wave function $\hat\psi(p)$, the Fourier transform
of $\psi(x)$, see (2.15). The relationship (4.21) can also be expressed in terms of
momentum wave functions as

$$\hat\psi_X(p) = \int \hat\xi(p - p')\,\hat\psi(p')\,dp', \tag{4.22}$$

where $\hat\psi_X(p)$ is the Fourier transform of $\psi_X(x)$, and $\hat\xi(p)$ is the Fourier transform
of

$$\xi(x) = \begin{cases} 1 & \text{for } x_1 \le x \le x_2, \\ 0 & \text{for } x < x_1 \text{ or } x > x_2. \end{cases} \tag{4.23}$$

Although (4.20) is the most straightforward way to define $X|\psi\rangle$, it is important to
note that the expression (4.22) is completely equivalent.
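
The action (4.20) of the projector X is simple enough to mimic on a computer if the wave function is sampled on a grid of points. The Python sketch below does this; the grid, the Gaussian sample wave function and the interval endpoints are all assumptions made for illustration, and the discretization itself is not part of the continuum definition given above.

```python
import math

def project_interval(psi, xs, x1, x2):
    """Discretized version of (4.20): keep psi inside [x1, x2], zero outside."""
    return [amp if x1 <= x <= x2 else 0.0 for amp, x in zip(psi, xs)]

def inner(f, g, dx):
    """Riemann-sum approximation to the inner product of two sampled functions."""
    return sum(a.conjugate() * b for a, b in zip(f, g)) * dx

dx = 0.5
xs = [-2.0 + dx * k for k in range(9)]     # grid points -2.0, -1.5, ..., 2.0
psi = [math.exp(-x * x) for x in xs]       # hypothetical sample wave function
x1, x2 = -0.5, 1.0

psi_X = project_interval(psi, xs, x1, x2)

# Applying X twice is the same as applying it once, as expected of a projector.
idempotent = project_interval(psi_X, xs, x1, x2) == psi_X

# A function vanishing throughout the interval is sent to zero by X and is
# orthogonal to every function in the subspace, in particular to psi_X.
omega = [0.0 if x1 <= x <= x2 else 1.0 for x in xs]
annihilated = all(v == 0.0 for v in project_interval(omega, xs, x1, x2))
overlap = inner(omega, psi_X, dx)
```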

As another example, consider the property that the momentum of a particle lies
in (that is, not outside) the interval

$$p_1 \le p \le p_2. \tag{4.24}$$

This property corresponds, Sec. 2.4, to the subspace $\mathcal{P}$ of momentum wave functions
$\hat\psi(p)$ which vanish outside this interval. The projector P corresponding to $\mathcal{P}$
can be defined by

$$\hat\psi_P(p) = P\hat\psi(p) = \begin{cases} \hat\psi(p) & \text{for } p_1 \le p \le p_2, \\ 0 & \text{for } p < p_1 \text{ or } p > p_2, \end{cases} \tag{4.25}$$

and in Dirac notation (4.25) takes the form

$$|\psi_P\rangle = P|\psi\rangle. \tag{4.26}$$

One could also express the position wave function $\psi_P(x)$ in terms of $\psi(x)$ using a
convolution integral analogous to (4.22).

As a third example, consider a one-dimensional harmonic oscillator. In textbooks
it is shown that the energy E of an oscillator with angular frequency $\omega$ takes
on a discrete set of values

$$E = n + \tfrac{1}{2}, \quad n = 0, 1, 2, \ldots \tag{4.27}$$

in units of $\hbar\omega$. Let $\phi_n(x)$ be the normalized wave function for energy E = n + 1/2,
and $|\phi_n\rangle$ the corresponding ket. The one-dimensional subspace of $\mathcal{H}$ consisting of
all scalar multiples of $|\phi_n\rangle$ represents the property that the energy is n + 1/2. The
corresponding projector is the dyad $[\phi_n]$. When this projector acts on some $|\psi\rangle$ in
$\mathcal{H}$, the result is

$$|\chi\rangle = [\phi_n]\,|\psi\rangle = |\phi_n\rangle\langle\phi_n|\psi\rangle = \langle\phi_n|\psi\rangle\,|\phi_n\rangle, \tag{4.28}$$

that is, $|\phi_n\rangle$ multiplied by the scalar $\langle\phi_n|\psi\rangle$. One can write (4.28) using wave
functions in the form

$$\chi(x) = \int P(x, x')\,\psi(x')\,dx', \tag{4.29}$$

where

$$P(x, x') = \phi_n(x)\,\phi_n^*(x') \tag{4.30}$$

corresponds to the dyad $|\phi_n\rangle\langle\phi_n|$.

Since the states $|\phi_n\rangle$ for different $n$ are mutually orthogonal, the same is true of the corresponding projectors $[\phi_n]$. Using this fact makes it easy to write down projectors for the energy to lie in some interval which includes two or more energy eigenvalues. For example, the projector

$$Q = [\phi_1] + [\phi_2] \tag{4.31}$$

onto the two-dimensional subspace of $\mathcal{H}$ consisting of linear combinations of $|\phi_1\rangle$ and $|\phi_2\rangle$ expresses the property that the energy $E$ (in units of $\hbar\omega$) lies inside some interval such as

$$1 < E < 3, \tag{4.32}$$

where the choice of endpoints of the interval is somewhat arbitrary, given that the energy is quantized and takes on only discrete values; any other interval which includes 1.5 and 2.5, but excludes 0.5 and 3.5, would be just as good. The action of $Q$ on a wave function $\psi(x)$ can be written as

$$\chi(x) = Q\psi(x) = \int Q(x, x')\,\psi(x')\,dx', \tag{4.33}$$

$$Q(x, x') = \phi_1(x)\,\phi_1^*(x') + \phi_2(x)\,\phi_2^*(x'). \tag{4.34}$$

Once again, it is important not to interpret “energy lying inside the interval (4.32)” as meaning that it either has the value 1.5 or that it has the value 2.5. The subspace onto which $Q$ projects also contains states such as $|\phi_1\rangle + |\phi_2\rangle$, for which the energy cannot be defined more precisely than by saying that it does not lie outside the interval, and thus the physical property expressed by $Q$ cannot have a meaning which is more precise than this.
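The last point can be checked in a small numerical sketch. It works in a truncated energy basis (the truncation size `N` is an assumption made only to obtain finite matrices) and shows that a normalized superposition of the two eigenstates lies in the subspace of the projector onto energies 1.5 and 2.5, yet has a nonzero energy spread.

```python
import numpy as np

N = 6                                   # truncation of the oscillator basis (illustrative)
H = np.diag(np.arange(N) + 0.5)         # energies n + 1/2 in units of hbar*omega

def dyad(n):
    """Projector [phi_n] = |phi_n><phi_n| in the truncated energy basis."""
    e = np.zeros(N)
    e[n] = 1.0
    return np.outer(e, e)

Q = dyad(1) + dyad(2)                   # projector onto energies 1.5 and 2.5

# Normalized version of |phi_1> + |phi_2>.
v = np.zeros(N)
v[1] = v[2] = 1.0 / np.sqrt(2.0)

in_subspace = np.allclose(Q @ v, v)     # Q leaves v unchanged
mean_E = v @ H @ v                      # average energy of v
spread = v @ H @ H @ v - mean_E**2      # variance; nonzero, so v is not an energy eigenstate
```

Here `mean_E` comes out as 2.0 and `spread` as 0.25, so the state has the property $Q$ without having either of the two sharper energy properties.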


4.4 Negation of properties (NOT) 

A physical property can be true or false in the sense that the statement that a par- 
ticular physical system at a particular time possesses a physical property can be 
either true or false. Books on logic present simple logical operations by which 
statements which are true or false can be transformed into other statements which 
are true or false. We shall consider three operations which can be applied to phys- 
ical properties: negation, taken up in this section, and conjunction and disjunction, 
taken up in Sec. 4.5. In addition, quantum properties are sometimes incompatible 
or “noncomparable”, a topic discussed in Sec. 4.6. 

As noted in Sec. 4.1, a classical property $P$ is associated with a subset $\mathcal{P}$ consisting of those points in the classical phase space for which the property is true. The points of the phase space which do not belong to $\mathcal{P}$ form the complementary set $\sim\mathcal{P}$, and this complementary set defines the negation “NOT $P$” of the property $P$. We shall write it as $\sim P$ or as $\tilde P$. Alternatively, one can define $\tilde P$ as the property which is true if and only if $P$ is false, and false if and only if $P$ is true. From this as well as from the other definition it is obvious that the negation of the negation of a property is the same as the original property: $\sim\tilde P$ or $\sim(\sim P)$ is the same property as $P$. The indicator $\tilde P(\gamma)$ of the property $\sim P$, see (4.1), is given by the formula

$$\tilde P = I - P, \tag{4.35}$$

or $\tilde P(\gamma) = I(\gamma) - P(\gamma)$, where $I(\gamma)$, the indicator of the identity property, is equal to 1 for all values of $\gamma$. Thus $\tilde P$ is equal to 1 (true) if $P$ is 0 (false), and $\tilde P = 0$ when $P = 1$.

Once again consider Fig. 2.1 on page 12, the phase space of a one-dimensional harmonic oscillator, where the ellipse corresponds to an energy $E_0$. The property $P$ that the energy is less than or equal to $E_0$ corresponds to the set $\mathcal{P}$ of points inside and on the ellipse. Its negation $\tilde P$ is the property that the energy is greater than $E_0$, and the corresponding region $\sim\mathcal{P}$ is all the points outside the ellipse. The vertical band $\mathcal{Q}$ corresponds to the property $Q$ that the position of the particle is in the interval $x_1 \le x \le x_2$. The negation of $Q$ is the property $\tilde Q$ that the particle lies outside this interval, and the corresponding set of points $\sim\mathcal{Q}$ in the phase space consists of the half planes to the left of $x = x_1$ and to the right of $x = x_2$.

A property of a quantum system is associated with a subspace of the Hilbert space, and thus the negation of this property will also be associated with some subspace of the Hilbert space. Consider, for example, a toy model with $M_a = 2 = M_b$. Its Hilbert space consists of all linear combinations of the states $|{-2}\rangle$, $|{-1}\rangle$, $|0\rangle$, $|1\rangle$, and $|2\rangle$. Suppose that $P$ is the property associated with the projector

$$P = [0] + [1] \tag{4.36}$$

projecting onto the subspace $\mathcal{P}$ of all linear combinations of $|0\rangle$ and $|1\rangle$. Its physical interpretation is that the quantum particle is confined to these two sites, that is, it is not at some location apart from these two sites. The negation $\tilde P$ of $P$ is the property that the particle is not confined to these two sites, but is instead someplace else, so the corresponding projector is

$$\tilde P = [-2] + [-1] + [2]. \tag{4.37}$$

This projects onto the orthogonal complement $\mathcal{P}^\perp$ of $\mathcal{P}$, see Sec. 3.4, consisting of all linear combinations of $|{-2}\rangle$, $|{-1}\rangle$ and $|2\rangle$. Since the identity operator for this Hilbert space is given by

$$I = \sum_{m=-2}^{2} [m], \tag{4.38}$$

see (3.52), it is evident that

$$\tilde P = I - P. \tag{4.39}$$


This is precisely the same as (4.35), except that the symbols now refer to quantum 
projectors rather than to classical indicators. 
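The toy-model relations can be verified with explicit matrices. The sketch below represents the five site states as standard basis vectors, which is an assumed but natural choice of representation; the helper name `proj` is introduced here only for illustration.

```python
import numpy as np

sites = [-2, -1, 0, 1, 2]               # site labels m of the toy model

def proj(m):
    """Projector [m] onto the site state |m>, as a 5x5 matrix."""
    e = np.zeros(len(sites))
    e[sites.index(m)] = 1.0
    return np.outer(e, e)

I = sum(proj(m) for m in sites)         # identity as the sum of all [m]
P = proj(0) + proj(1)                   # particle confined to sites 0 and 1
P_neg = proj(-2) + proj(-1) + proj(2)   # particle somewhere else

negation_ok = np.allclose(P_neg, I - P)     # negation equals I - P
orthogonal = np.allclose(P @ P_neg, 0.0)    # P and its negation are mutually orthogonal
```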

As a second example, consider a one-dimensional harmonic oscillator, Sec. 4.3. Suppose that $P$ is the property that the energy is less than or equal to 2 in units of $\hbar\omega$. The corresponding projector is

$$P = [\phi_0] + [\phi_1] \tag{4.40}$$

in the notation used in Sec. 4.3. The negation of $P$ is the property that the energy is greater than 2, and its projector is

$$\tilde P = [\phi_2] + [\phi_3] + [\phi_4] + \cdots = I - P. \tag{4.41}$$

In this case, $P$ projects onto a finite and $\tilde P$ onto an infinite-dimensional subspace of $\mathcal{H}$.

As a third example, consider the property $X$ that a particle in one dimension is located in (that is, not outside) the interval (4.19), $x_1 \le x \le x_2$; the corresponding projector $X$ was defined in (4.20). Using the fact that $I\psi(x) = \psi(x)$, it is easy to show that the projector $\tilde X = I - X$, corresponding to the property that the particle is located outside (not inside) the interval (4.19), is given by

$$\tilde X\psi(x) = \begin{cases} 0 & \text{for } x_1 \le x \le x_2, \\ \psi(x) & \text{for all other } x \text{ values.} \end{cases} \tag{4.42}$$

(Note that in this case the action of the projectors $X$ and $\tilde X$ is to multiply $\psi(x)$ by the indicator function for the corresponding classical property.)

As a final example, consider a spin-half particle, and let $P$ be the property $S_z = +1/2$ (in units of $\hbar$) corresponding to the projector $[z^+]$. One can think of this as analogous to a toy model with $M = 2$ sites $m = 0, 1$, where $[z^+]$ corresponds to $[0]$. Then it is evident from the earlier discussion that the negation $\tilde P$ of $P$ will be the projector $[z^-]$, the counterpart of $[1]$ in the toy model, corresponding to the property $S_z = -1/2$. Of course, the same reasoning can be applied with $z$ replaced by an arbitrary direction $w$: the property $S_w = -1/2$ is the negation of $S_w = +1/2$, and vice versa.

The relationship between the projector for a quantum property and the projector for its negation, (4.39), is formally the same as the relationship between the corresponding indicators for a classical property, (4.35). Despite this close analogy, there is actually an important difference. In the classical case, the subset $\sim\mathcal{P}$ corresponding to $\tilde P$ is the complement of the subset corresponding to $P$: any point in the phase space is in one or the other, and the two subsets do not overlap. In the quantum case, the subspaces $\mathcal{P}^\perp$ and $\mathcal{P}$ corresponding to $\tilde P$ and $P$ have one element in common, the zero vector. This is different from the classical phase space, but is not important, for the zero vector by itself stands for the property which is always false, corresponding to the empty subset of the classical phase space. Much more significant is the fact that $\mathcal{H}$ contains many nonzero elements which belong neither to $\mathcal{P}^\perp$ nor to $\mathcal{P}$. In particular, the sum of a nonzero vector from $\mathcal{P}^\perp$ and a nonzero vector from $\mathcal{P}$ belongs to $\mathcal{H}$, but does not belong to either of these subspaces. For example, the ket $|x^+\rangle$ for a spin-half particle corresponding to $S_x = +1/2$ belongs neither to the subspace associated with $S_z = +1/2$ nor to that of its negation $S_z = -1/2$. Thus despite the formal parallel, the difference between the mathematics of Hilbert space and that of a classical phase space means that negation is not quite the same thing in quantum physics as it is in classical physics.
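The spin-half example can be made concrete with two-component vectors. The sketch below assumes the usual convention in which the $S_z = \pm 1/2$ states are the standard basis vectors and the $S_x = +1/2$ state is their normalized sum; membership in a subspace is tested by checking whether the projector leaves the vector unchanged.

```python
import numpy as np

z_plus = np.array([1.0, 0.0])                   # S_z = +1/2
z_minus = np.array([0.0, 1.0])                  # S_z = -1/2
x_plus = (z_plus + z_minus) / np.sqrt(2.0)      # S_x = +1/2 (standard convention, assumed)

Pz = np.outer(z_plus, z_plus)                   # projector [z+]
Pz_neg = np.eye(2) - Pz                         # its negation, the projector [z-]

# A vector lies in a projector's subspace exactly when the projector leaves it unchanged.
in_P = np.allclose(Pz @ x_plus, x_plus)
in_P_neg = np.allclose(Pz_neg @ x_plus, x_plus)
```

Both flags come out false: the vector is a sum of nonzero pieces from each subspace and belongs to neither.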


4.5 Conjunction and disjunction (AND, OR)

Consider two different properties $P$ and $Q$ of a classical system, corresponding to subsets $\mathcal{P}$ and $\mathcal{Q}$ of its phase space. The system will possess both properties simultaneously if its phase point $\gamma$ lies in the intersection $\mathcal{P} \cap \mathcal{Q}$ of the sets $\mathcal{P}$ and $\mathcal{Q}$ or, using indicators, if $P(\gamma) = 1 = Q(\gamma)$. See the Venn diagram in Fig. 4.1(a). In this case we can say that the system possesses the property “$P$ AND $Q$”, the conjunction of $P$ and $Q$, which can be written compactly as $P \wedge Q$. The corresponding indicator function is

$$P \wedge Q = PQ, \tag{4.43}$$

that is, $(P \wedge Q)(\gamma)$ is the function $P(\gamma)$ times the function $Q(\gamma)$. In the case of a one-dimensional harmonic oscillator, let $P$ be the property that the energy is less than $E_0$, and $Q$ the property that $x$ lies between $x_1$ and $x_2$. Then the indicator $PQ$ for the combined property $P \wedge Q$, “energy less than $E_0$ AND $x$ between $x_1$ and $x_2$”, is 1 at those points in the cross-hatched band in Fig. 2.1 which lie inside the ellipse, and 0 everywhere else.

Given the close correspondence between classical indicators and quantum projectors, one might expect that the projector for the quantum property $P \wedge Q$ ($P$ AND $Q$) would be the product of the projectors for the separate properties, as in (4.43). This is indeed the case if $P$ and $Q$ commute with each other, that is, if

$$PQ = QP. \tag{4.44}$$

In this case it is easy to show that the product $PQ$ is a projector satisfying the two conditions in (3.34). On the other hand, if (4.44) is not satisfied, then $PQ$ will not be a Hermitian operator, so it cannot be a projector. In this section we will discuss the conjunction and disjunction of properties $P$ and $Q$ assuming that the two projectors commute. The case in which they do not commute is taken up in Sec. 4.6.

Fig. 4.1. The circles represent the properties $P$ and $Q$. In (a) the grey region is $P \wedge Q$, and in (b) it is $P \vee Q$.

As a first example, consider the case of a one-dimensional harmonic oscillator in which $P$ is the property that the energy $E$ is less than 3 (in units of $\hbar\omega$), and $Q$ the property that $E$ is greater than 2. The two projectors are

$$P = [\phi_0] + [\phi_1] + [\phi_2], \quad Q = [\phi_2] + [\phi_3] + [\phi_4] + \cdots, \tag{4.45}$$

and their product is $PQ = QP = [\phi_2]$, the projector onto the state with energy 2.5. As this is the only possible energy of the oscillator which is both greater than 2 and less than 3, the result makes sense.
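This product can be checked numerically in a truncated energy basis. The truncation size `N` below is an assumption needed to cut off the infinite sum for the second projector; it does not affect the product as long as `N` exceeds 3.

```python
import numpy as np

N = 8                                   # truncation of the energy basis (illustrative)

def dyad(n):
    """Projector [phi_n] in the truncated energy basis."""
    e = np.zeros(N)
    e[n] = 1.0
    return np.outer(e, e)

P = dyad(0) + dyad(1) + dyad(2)             # energy less than 3
Q = sum(dyad(n) for n in range(2, N))       # energy greater than 2, truncated at N

commute = np.allclose(P @ Q, Q @ P)
product_is_phi2 = np.allclose(P @ Q, dyad(2))
```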

As a second example, suppose that the property $X$ corresponds to a quantum particle inside (not outside) the interval (4.19), $x_1 \le x \le x_2$, and $X'$ to the property that the particle is inside the interval

$$x_1' \le x \le x_2'. \tag{4.46}$$

In addition, assume that the endpoints of these intervals are in the order

$$x_1 < x_1' < x_2 < x_2'. \tag{4.47}$$

For a classical particle, $X \wedge X'$ clearly corresponds to its being inside the interval

$$x_1' \le x \le x_2. \tag{4.48}$$

In the quantum case, it is easy to show that $XX' = X'X$ is the projector which when applied to a wave function $\psi(x)$ sets it equal to 0 everywhere outside the interval (4.48) while leaving it unchanged inside this interval. This result is sensible, because if a wave packet lies inside the interval (4.48), it will also be inside both of the intervals (4.19) and (4.46).
When two projectors $P$ and $Q$ are mutually orthogonal in the sense defined in Sec. 3.5,

$$PQ = 0 = QP \tag{4.49}$$

(each equality implies the other), the corresponding properties $P$ and $Q$ are mutually exclusive in the sense that if one is true, the other must be false. The reason is that the 0 operator which represents the conjunction $P \wedge Q$ corresponds, as does the 0 indicator for a classical system, to the property which is always false. Hence it is impossible for both $P$ and $Q$ to be true at the same time, for then $P \wedge Q$ would be true. As an example, consider the harmonic oscillator discussed earlier, but change the definitions so that $P$ is the property $E < 2$ and $Q$ the property $E > 3$. Then $PQ = 0$, for there is no energy which is both less than 2 and greater than 3. Similarly, if the intervals corresponding to $X$ and $X'$ for a particle in one dimension do not overlap (for example, suppose $x_2 < x_1'$ in place of (4.47)), then $XX' = 0$, and if the particle is between $x_1$ and $x_2$, it cannot be between $x_1'$ and $x_2'$. Note that this means that a quantum particle, just like its classical counterpart, can never be in two places at the same time, contrary to some misleading popularizations of quantum theory.

The disjunction of two properties $P$ and $Q$, “$P$ OR $Q$”, where “OR” is understood in the nonexclusive sense of “$P$ or $Q$ or both”, can be written in the compact form $P \vee Q$. If $P$ and $Q$ are classical properties corresponding to the subsets $\mathcal{P}$ and $\mathcal{Q}$ of a classical phase space, $P \vee Q$ corresponds to the union $\mathcal{P} \cup \mathcal{Q}$ of these two subsets, see Fig. 4.1(b), and the indicator is given by:

$$P \vee Q = P + Q - PQ, \tag{4.50}$$

where the final term $-PQ$ on the right makes an appropriate correction at points in $\mathcal{P} \cap \mathcal{Q}$ where the two subsets overlap, and $P + Q = 2$.

The notions of disjunction (OR) and conjunction (AND) are related to each other by formulas familiar from elementary logic:

$$\sim(P \vee Q) = \tilde P \wedge \tilde Q, \qquad \sim(P \wedge Q) = \tilde P \vee \tilde Q. \tag{4.51}$$

The negation of the first of these yields

$$P \vee Q = \sim(\tilde P \wedge \tilde Q), \tag{4.52}$$

and one can use this expression along with (4.35) to obtain the right side of (4.50):

$$I - [(I - P)(I - Q)] = P + Q - PQ. \tag{4.53}$$


Thus if negation and conjunction have already been defined, disjunction does not 
introduce anything that is really new. 
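For readers who want the intermediate algebra, the step from negation and conjunction to the disjunction formula can be spelled out as follows. The only facts used are that the negation of an indicator is $I$ minus that indicator, that conjunction is the product, and that $I$ acts as the multiplicative identity.

```latex
\begin{aligned}
P \vee Q &= I - (I - P)(I - Q) \\
         &= I - (I - P - Q + PQ) \\
         &= P + Q - PQ .
\end{aligned}
```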
The preceding remarks also apply to the quantum case. In particular, (4.53) is valid if $P$ and $Q$ are projectors. However, $P + Q - PQ$ is a projector if and only if $PQ = QP$. Thus as long as $P$ and $Q$ commute, we can use (4.50) to define the projector corresponding to the property $P$ OR $Q$. There is, however, something to be concerned about. Suppose, to take a simple example, $P = [0]$ and $Q = [1]$ for a toy model. Then (4.50) gives $[0] + [1]$ for $P \vee Q$. However, as pointed out earlier, the subspace onto which $[0] + [1]$ projects contains kets which do not have either the property $P$ or the property $Q$. Thus $[0] + [1]$ means something less definite than $[0]$ or $[1]$. A satisfactory resolution of this problem requires the notion of a quantum Boolean event algebra, which will be introduced in Sec. 5.2. In the meantime we will simply adopt (4.50) as a definition of what is meant by the quantum projector $P \vee Q$ when $PQ = QP$, and leave till later a discussion of just how it is to be interpreted.
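A small numerical check illustrates when the disjunction formula yields a projector. The helper names `is_projector` and `disjunction` are introduced here for illustration; the commuting pair is taken from a three-site toy model and the noncommuting pair from a spin-half particle, both represented in assumed standard bases.

```python
import numpy as np

def is_projector(A):
    """A projector is Hermitian and equal to its own square."""
    return bool(np.allclose(A, A.conj().T) and np.allclose(A @ A, A))

def disjunction(P, Q):
    """The operator P + Q - PQ; a projector only when P and Q commute."""
    return P + Q - P @ Q

# Commuting pair: [0] and [1] for a three-site toy model.
P0 = np.diag([1.0, 0.0, 0.0])
P1 = np.diag([0.0, 1.0, 0.0])
commuting_case = is_projector(disjunction(P0, P1))

# Noncommuting pair: [x+] and [z+] for a spin-half particle.
x_plus = np.array([1.0, 1.0]) / np.sqrt(2.0)
Px = np.outer(x_plus, x_plus)
Pz = np.diag([1.0, 0.0])
noncommuting_case = is_projector(disjunction(Px, Pz))
```

In the first case the result is a projector; in the second it is not even Hermitian.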


4.6 Incompatible properties 

The situation in which two projectors $P$ and $Q$ do not commute with each other, $PQ \ne QP$, has no classical analog, since the product of two indicator functions on the classical phase space does not depend upon the order of the factors. Consequently, classical physics gives no guidance as to how to think about the conjunction $P \wedge Q$ ($P$ AND $Q$) of two quantum properties when their projectors do not commute.

Consider the example of a spin-half particle, let $P$ be the property $S_x = +1/2$, and $Q$ the property that $S_z = +1/2$. The projectors are

$$P = [x^+], \quad Q = [z^+], \tag{4.54}$$

and it is easy to show by direct calculation that $[x^+][z^+]$ is unequal to $[z^+][x^+]$, and that neither is a projector. Let us suppose that it is nevertheless possible to define a property $[x^+] \wedge [z^+]$. To what subspace of the two-dimensional spin space might it correspond? Every one-dimensional subspace of the Hilbert space of a spin-half particle corresponds to the property $S_w = +1/2$ for some direction $w$ in space, as discussed in Sec. 4.2. Thus if $[x^+] \wedge [z^+]$ were to correspond to a one-dimensional subspace, it would have to be associated with such a direction. Clearly the direction cannot be $x$, for $S_x = +1/2$ does not have the property $S_z = +1/2$; see the discussion in Sec. 4.2. By similar reasoning it cannot be $z$, and all other choices for $w$ are even worse, because then $S_w = +1/2$ possesses neither the property $S_x = +1/2$ nor the property $S_z = +1/2$, much less both of these properties!
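The direct calculation mentioned above is short enough to carry out numerically. The sketch below assumes the standard basis in which the $S_z = +1/2$ state is the first basis vector and the $S_x = +1/2$ state is the normalized sum of the two basis vectors.

```python
import numpy as np

x_plus = np.array([1.0, 1.0]) / np.sqrt(2.0)   # S_x = +1/2 (standard convention, assumed)
z_plus = np.array([1.0, 0.0])                  # S_z = +1/2

Px = np.outer(x_plus, x_plus)                  # projector [x+]
Pz = np.outer(z_plus, z_plus)                  # projector [z+]

A = Px @ Pz                                    # [x+][z+]
B = Pz @ Px                                    # [z+][x+]

products_differ = not np.allclose(A, B)
A_hermitian = bool(np.allclose(A, A.T))        # real matrices, so transpose suffices
A_idempotent = bool(np.allclose(A @ A, A))
```

The two products differ, and the first fails both the Hermiticity and the idempotence tests, so it cannot be a projector; the second is its transpose and fails in the same way.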

If one-dimensional subspaces are out of the question, what is left? There is a two-dimensional “subspace” which is the entire space, with projector $I$ corresponding to the property which is always true. But given that neither $[x^+]$ nor $[z^+]$ is a property which is always true, it seems ridiculous to suppose that $[x^+] \wedge [z^+]$ corresponds to $I$. There is also the zero-dimensional subspace which contains only the zero vector, corresponding to the property which is always false. Does it make sense to suppose that $[x^+] \wedge [z^+]$, thought of as a particular property possessed by a given spin-half particle at a particular time, is always false in the sense that there are no circumstances in which it could be true? If we adopt this proposal we will, obviously, also want to say that $[x^+] \wedge [z^-]$ is always false. Following the usual rules of logic, the disjunction (OR) of two false propositions is false. Therefore, the left side of

$$([x^+] \wedge [z^+]) \vee ([x^+] \wedge [z^-]) = [x^+] \wedge ([z^+] \vee [z^-]) = [x^+] \wedge I = [x^+] \tag{4.55}$$

is always false, and thus the right side, the property $S_x = +1/2$, is always false. But this makes no sense, for there are circumstances in which $S_x = +1/2$ is true.

To obtain the first equality in (4.55) requires the use of the distributive identity

$$(P \wedge Q) \vee (P \wedge R) = P \wedge (Q \vee R) \tag{4.56}$$

of standard logic, with $P = [x^+]$, $Q = [z^+]$, and $R = [z^-]$. One way of avoiding the silly result implied by (4.55) is to modify the laws of logic so that the distributive law does not hold. In fact, Birkhoff and von Neumann proposed a special quantum logic in which (4.56) is no longer valid. Despite a great deal of effort, this quantum logic has not turned out to be of much help in understanding quantum theory, and we shall not make use of it.

In conclusion, there seems to be no plausible way to assign a subspace to the conjunction $[x^+] \wedge [z^+]$ of these two properties, or to any other conjunction of two properties of a spin-half particle which are represented by noncommuting projectors. Such conjunctions are therefore meaningless in the sense that the Hilbert space approach to quantum theory, in which properties are associated with subspaces, cannot assign them a meaning. It is sometimes said that it is impossible to measure both $S_x$ and $S_z$ simultaneously for a spin-half particle. While this statement is true, it is important to note that the inability to carry out such a measurement reflects the fact that there is no corresponding property which could be measured. How could a measurement tell us, for example, that for a spin-half particle $S_x = +1/2$ and $S_z = +1/2$, if the property $[x^+] \wedge [z^+]$ cannot even be defined?

Guided by the spin-half example, we shall say that two properties $P$ and $Q$ of any quantum system are incompatible when their projectors do not commute, $PQ \ne QP$, and that the conjunction $P \wedge Q$ of incompatible properties is meaningless in the sense that quantum theory assigns it no meaning. On the other hand, if $PQ = QP$, the properties are compatible, and their conjunction $P \wedge Q$ corresponds to the projector $PQ$.
To say that $P \wedge Q$ is meaningless when $PQ \ne QP$ is very different from saying that it is false. The negation of a false statement is a true statement, so if $P \wedge Q$ is false, its negation $\tilde P \vee \tilde Q$, see (4.51), is true. On the other hand, the negation of a meaningless statement is equally meaningless. Meaningless statements can also occur in ordinary logic. Thus if $P$ and $Q$ are two propositions of an appropriate sort, $P \wedge Q$ is meaningful, but $P \wedge \vee\, Q$ is meaningless: this last expression cannot be true or false, it just doesn’t make any sense. In the quantum case, “$P \wedge Q$” when $PQ \ne QP$ is something like $P \wedge \vee\, Q$ in ordinary logic. Books on logic always
devote some space to the rules for constructing meaningful statements. Physicists 
when reading books on logic tend to skip over the chapters which give these rules, 
because the rules seem intuitively obvious. In quantum theory, on the other hand, 
it is necessary to pay some attention to the rules which separate meaningful and 
meaningless statements, because they are not the same as in classical physics, and 
hence they are not intuitively obvious, at least until one has built up some intuition 
for the quantum world. 

When P and Q are incompatible, it makes no sense to ascribe both properties 
to a single system at the same instant of time. However, this does not exclude the 
possibility that P might be a meaningful (true or false) property at one instant of 
time and Q a meaningful property at a different time. We will discuss the time 
dependence of quantum systems starting in Ch. 7. Similarly, P and Q might refer 
to two distinct physical systems: for example, there is no problem in supposing 
that $S_x = +1/2$ for one spin-half particle, and $S_z = +1/2$ for a different particle.

At the end of Sec. 4.1 we stated that if a quantum system is described by a ket $|\psi\rangle$ which is not an eigenstate of a projector $P$, then the physical property associated with this projector is undefined. The situation can also be discussed in terms of incompatible properties, for saying that a quantum system is described by $|\psi\rangle$ is equivalent to asserting that it has the property $[\psi]$ corresponding to the ray which contains $|\psi\rangle$. It is easy to show that the projectors $[\psi]$ and $P$ commute if and only if $|\psi\rangle$ is an eigenstate of $P$, whereas in all other cases $[\psi]P \ne P[\psi]$, so they represent incompatible properties.
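The commutation criterion can be illustrated with a three-dimensional example. The particular projector and the three test vectors below are choices made here for illustration only; the helpers `ray` and `commutes` are hypothetical names.

```python
import numpy as np

P = np.diag([1.0, 1.0, 0.0])            # a projector onto a two-dimensional subspace

def ray(psi):
    """Projector onto the ray containing the (not necessarily normalized) vector psi."""
    psi = np.asarray(psi, dtype=float)
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi)

def commutes(A, B):
    return bool(np.allclose(A @ B, B @ A))

eigen_1 = commutes(ray([1, 1, 0]), P)      # eigenstate of P with eigenvalue 1
eigen_0 = commutes(ray([0, 0, 1]), P)      # eigenstate of P with eigenvalue 0
not_eigen = commutes(ray([1, 0, 1]), P)    # not an eigenstate of P
```

The first two ray projectors commute with the chosen projector, while the third does not.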

It is possible for $|\psi\rangle$ to simultaneously be an eigenstate with eigenvalue 1 of two incompatible projectors $P$ and $Q$. For example, for the toy model of Sec. 4.2, let

$$|\psi\rangle = |2\rangle, \quad P = [\sigma] + [2], \quad Q = [1] + [2], \tag{4.57}$$

where $|\sigma\rangle$ is defined in (4.10). The definition given in Sec. 4.1 allows us to conclude that the quantum system described by $|\psi\rangle$ has the property $P$, but we could equally well conclude that it has the property $Q$. However, it makes no sense to say that it has both properties. Sorting out this issue will require some additional concepts found in later chapters.
If the conjunction of incompatible properties is meaningless, then so is the disjunction of incompatible properties: $P \vee Q$ ($P$ OR $Q$) makes no sense if $PQ \ne QP$. This follows at once from (4.52), because if $P$ and $Q$ are incompatible, so are their negations $\tilde P$ and $\tilde Q$, as can be seen by multiplying out $(I - P)(I - Q)$ and comparing it with $(I - Q)(I - P)$. Hence $\tilde P \wedge \tilde Q$ is meaningless, and so is its negation. Other sorts of logical comparisons, such as the exclusive OR (XOR), are also not possible in the case of incompatible properties.
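The multiplication referred to above takes only a line. Writing out both orderings and subtracting shows that the commutator of the negations equals the commutator of the original projectors, so one vanishes exactly when the other does.

```latex
\begin{aligned}
(I - P)(I - Q) &= I - P - Q + PQ, \\
(I - Q)(I - P) &= I - Q - P + QP, \\
(I - P)(I - Q) - (I - Q)(I - P) &= PQ - QP .
\end{aligned}
```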

If $PQ \ne QP$, the question “Does the system have property $P$ or does it have property $Q$?” makes no sense if understood in a way which requires a comparison of these two incompatible properties. Thus one answer might be, “The system has property $P$ but it does not have property $Q$”. This is equivalent to affirming the truth of $P$ and the falsity of $Q$, so that $P$ and $\tilde Q$ are simultaneously true. But since $P\tilde Q \ne \tilde Q P$, this makes no sense. Another answer might be that “The system has both properties $P$ and $Q$”, but the assertion that $P$ and $Q$ are simultaneously true also does not make sense. And a question to which one cannot give a meaningful answer is not a meaningful question.

In the case of a spin-half particle it does not make sense to ask whether $S_x = +1/2$ or $S_z = +1/2$, since the corresponding projectors do not commute with each other. This may seem surprising, since it is possible to set up a device which will produce spin-half particles with a definite polarization, $S_w = +1/2$, where $w$ is a direction determined by some property or setting of the device. (This could, for example, be the direction of the magnetic field gradient in a Stern-Gerlach apparatus, Sec. 17.2.) In such a case one can certainly ask whether the setting of the device is such as to produce particles with $S_x = +1/2$ or with $S_z = +1/2$. However, the values of components of spin angular momentum for a particle polarized by this device are then properties dependent upon properties of the device in the sense described in Ch. 14, and can only sensibly be discussed with reference to the device.

Along with different components of spin for a spin-half particle, it is easy to find many other examples of incompatible properties of quantum systems. Thus the projectors $X$ and $P$ in Sec. 4.3, for the position of a particle to lie between $x_1$ and $x_2$ and its momentum between $p_1$ and $p_2$, respectively, do not commute with each other. In the case of a harmonic oscillator, neither $X$ nor $P$ commutes with projectors, such as $[\phi_0] + [\phi_1]$, which define a range for the energy. That quantum operators, including the projectors which represent quantum properties, do not always commute with each other is a consequence of employing the mathematical structure of a quantum Hilbert space rather than that of a classical phase space. Consequently, there is no way to get around the fact that quantum properties cannot always be thought of in the same way as classical properties. Instead, one has to pay attention to the rules for combining them if one wants to avoid inconsistencies and paradoxes.
Probabilities and physical variables


5.1 Classical sample space and event algebra 

Probability theory is based upon the concept of a sample space of mutually ex- 
clusive possibilities, one and only one of which actually occurs, or is true, in any 
given situation. The elements of the sample space are sometimes called points or 
elements or events. In classical and quantum mechanics the sample space usually 
consists of various possible states or properties of some physical system. For ex- 
ample, if a coin is tossed, there are two possible outcomes: H (heads) or T (tails), 
and the sample space $\mathcal{S}$ is $\{H, T\}$. If a die is rolled, the sample space $\mathcal{S}$ consists
of six possible outcomes: $s = 1, 2, 3, 4, 5, 6$. If two individuals $A$ and $B$ share an
office, the occupancy sample space consists of four possibilities: an empty office, 
A present, B present, or both A and B present. 

Associated with a sample space $\mathcal{S}$ is an event algebra $\mathcal{B}$ consisting of subsets of elements of the sample space. In the case of a die, “$s$ is even” is an event in the event algebra. So are “$s$ is odd”, “$s$ is less than 4”, and “$s$ is equal to 2.” It is sometimes useful to distinguish events which are elements of the sample space, such as $s = 2$ in the previous example, and those which correspond to more than one element of the sample space, such as “$s$ is even”. We shall refer to the former as elementary events and to the latter as compound events. If the sample space $\mathcal{S}$ is finite and contains $n$ points, the event algebra contains $2^n$ possibilities, including the entire sample space $\mathcal{S}$ considered as a single compound event, and the empty set $\emptyset$. For various technical reasons it is convenient to include $\emptyset$, even though it never actually occurs: it is the event which is always false. Similarly, the compound event $\mathcal{S}$, the set of all elements in the sample space, is always true. The subsets of $\mathcal{S}$ form a Boolean algebra or Boolean lattice $\mathcal{B}$ under the usual set-theoretic relationships: the complement $\sim\mathcal{E}$ of a subset $\mathcal{E}$ of $\mathcal{S}$ is the set of elements of $\mathcal{S}$ which are not in $\mathcal{E}$. The intersection $\mathcal{E} \cap \mathcal{F}$ of two subsets is the collection of elements they have in common, whereas their union $\mathcal{E} \cup \mathcal{F}$ is the collection of elements belonging to one or the other, or possibly both.
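For the die, the whole event algebra is small enough to enumerate. The sketch below builds all subsets of the six-element sample space and applies the three set operations to two of the events named above; the variable names are illustrative.

```python
from itertools import combinations

S = frozenset(range(1, 7))              # sample space of a die

# The event algebra: every subset of S, from the empty set up to S itself.
events = [frozenset(c)
          for k in range(len(S) + 1)
          for c in combinations(sorted(S), k)]

even = frozenset(s for s in S if s % 2 == 0)        # compound event "s is even"
less_than_4 = frozenset(s for s in S if s < 4)      # compound event "s is less than 4"

complement = S - even                   # "s is odd"
both = even & less_than_4               # intersection: the elementary event s = 2
either = even | less_than_4             # union
```

The list `events` has $2^6 = 64$ members, including the empty set and the full sample space.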

The phase space of a classical mechanical system is a sample space, since one 
and only one point in this space represents the actual state of the system at a partic- 
ular time. Since this space contains an uncountably infinite number of points, one 
usually defines the event algebra not as the collection of all subsets of points in the 
phase space, but as some more manageable collection, such as the Borel sets. 

A useful analogy with quantum theory is provided by a coarse graining of the classical phase space, a finite or countably infinite collection of nonoverlapping regions or cells which together cover the phase space. These cells, which in the notation of Ch. 4 represent properties of the physical system, constitute a sample space $\mathcal{S}$ of mutually exclusive possibilities, since a point $\gamma$ in the phase space representing the state of the system at a particular time will be in one and only one cell, making this cell a true property, whereas the properties corresponding to all of the other cells in the sample space are false. (Note that individual points in the phase space are not, in and of themselves, members of $\mathcal{S}$.) The event algebra $\mathcal{B}$ associated with this coarse graining consists of collections of one or more cells from the sample space, along with the empty set and the collection of all the cells. Each event in $\mathcal{B}$ is associated with a physical property corresponding to the set of all points in the phase space lying in one of the cells making up the (in general compound) event. The negation of an event $\mathcal{E}$ is the collection of cells which are in $\mathcal{S}$ but not in $\mathcal{E}$, the conjunction of two events $\mathcal{E}$ and $\mathcal{F}$ is the collection of cells which they have in common, and their disjunction the collection of cells belonging to $\mathcal{E}$ or to $\mathcal{F}$ or to both.

As an example, consider a one-dimensional harmonic oscillator whose phase space is the $x, p$ plane. One possible coarse graining consists of the four cells

$$x \ge 0,\ p \ge 0; \quad x < 0,\ p \ge 0; \quad x \ge 0,\ p < 0; \quad x < 0,\ p < 0; \tag{5.1}$$

that is, the four quadrants defined so as not to overlap. Another coarse graining is the collection $\{C_n\}$, $n = 1, 2, \ldots$ of cells

$$C_n : (n - 1)E_0 \le E < nE_0 \tag{5.2}$$

defined in terms of the energy $E$, where $E_0 > 0$ is some constant. Still another coarse graining consists of the rectangles

$$D_{mn} : mx_0 \le x < (m + 1)x_0, \quad np_0 \le p < (n + 1)p_0, \tag{5.3}$$

where $x_0 > 0$, $p_0 > 0$ are constants, and $m$ and $n$ are any integers.
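Each of these coarse grainings amounts to a function that assigns to every phase point exactly one cell label. The sketch below is one possible way to write such functions; it assumes that the coordinate axes are assigned to the cells with nonnegative $x$ and nonnegative $p$ so that the quadrants cover the plane, and that the oscillator has unit mass and unit angular frequency so that the energy is $(p^2 + x^2)/2$. The function names are hypothetical.

```python
import math

def quadrant_cell(x, p):
    """Label of the quadrant containing (x, p); axes go with the nonnegative side (assumed)."""
    return ("x>=0" if x >= 0 else "x<0", "p>=0" if p >= 0 else "p<0")

def energy_cell(x, p, E0):
    """Index n of the energy cell with (n-1)*E0 <= E < n*E0.
    Assumes unit mass and unit angular frequency: E = (p**2 + x**2) / 2."""
    E = 0.5 * (p * p + x * x)
    return math.floor(E / E0) + 1

def rectangle_cell(x, p, x0, p0):
    """Indices (m, n) of the rectangle with m*x0 <= x < (m+1)*x0 and n*p0 <= p < (n+1)*p0."""
    return (math.floor(x / x0), math.floor(p / p0))
```

Because each function returns a single label for any input, every phase point lies in one and only one cell, which is what makes the cells a sample space.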

As in Sec. 4.1, we define the indicator or indicator function $E$ for an event $\mathcal{E}$ to be the function on the sample space which takes the value 1 (true) on each element of the sample space which is in the set $\mathcal{E}$, and 0 (false) on all other elements:

$$E(s) = \begin{cases} 1 & \text{for } s \in \mathcal{E}, \\ 0 & \text{otherwise.} \end{cases} \tag{5.4}$$


The indicators form an algebra under the operations of negation ($\sim E$), conjunction
($E \wedge F$), and disjunction ($E \vee F$), as discussed in Secs. 4.4 and 4.5:

$$\begin{aligned} \sim E = \tilde{E} &= I - E, \\ E \wedge F &= EF, \\ E \vee F &= E + F - EF, \end{aligned} \tag{5.5}$$


where the arguments of the indicators have been omitted; one could also write
$(E \wedge F)(s) = E(s)F(s)$, etc. Obviously $E \wedge F$ and $E \vee F$ are the counterparts of
$\mathcal{E} \cap \mathcal{F}$ and $\mathcal{E} \cup \mathcal{F}$ for the corresponding subsets of S. We shall use the terms “event
algebra” and “Boolean algebra” for either the algebra of sets or the corresponding
algebra of indicators.
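
The correspondence between the indicator formulas in (5.5) and the set operations can be checked directly on a small sample space. The sketch below uses the six faces of a die; the events `even` and `low` and the helper `indicator` are illustrative choices, not notation from the text.

```python
S = range(1, 7)  # sample space of a six-sided die

def indicator(event):
    """Indicator of a set of outcomes, stored as a dict s -> 0 or 1."""
    return {s: int(s in event) for s in S}

even = {2, 4, 6}
low = {1, 2, 3}
E, F = indicator(even), indicator(low)

# The three operations of (5.5), evaluated pointwise on the sample space.
neg_E = {s: 1 - E[s] for s in S}                    # negation: I - E
E_and_F = {s: E[s] * F[s] for s in S}               # conjunction: EF
E_or_F = {s: E[s] + F[s] - E[s] * F[s] for s in S}  # disjunction: E + F - EF

# Each agrees with the indicator of the corresponding set operation.
matches_complement = neg_E == indicator(set(S) - even)
matches_intersection = E_and_F == indicator(even & low)
matches_union = E_or_F == indicator(even | low)
```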

Associated with each element $r$ of a sample space is a special indicator $P_r$ which
is zero except at the point $r$:

$$P_r(s) = \begin{cases} 1 & \text{if } s = r, \\ 0 & \text{if } s \ne r. \end{cases} \tag{5.6}$$


Indicators of this type will be called elementary or minimal, and it is easy to see 
that 


$$P_r P_s = \delta_{rs} P_s. \tag{5.7}$$

The vanishing of the product of two elementary indicators associated with dis- 
tinct elements of the sample space reflects the fact that these events are mutually- 
exclusive possibilities: if one of them occurs (is true), the other cannot occur (is 
false), since the zero indicator denotes the “event” which never occurs (is always 
false). An indicator $R$ on the sample space corresponding to the (in general
compound) event $\mathcal{R}$ can be written as a sum of elementary indicators,

$$R = \sum_{s \in S} \pi_s P_s, \tag{5.8}$$

where $\pi_s$ is equal to 1 if $s$ is in $\mathcal{R}$, and 0 otherwise. The indicator $I$, which takes
the value 1 everywhere, can be written as

$$I = \sum_{s \in S} P_s, \tag{5.9}$$

which is (5.8) with $\pi_s = 1$ for every $s$.


5.2 Quantum sample space and event algebra

In Sec. 3.5 a decomposition of the identity was defined to be an orthogonal
collection of projectors $\{P_j\}$,

$$P_j P_k = \delta_{jk} P_j, \tag{5.10}$$

which sum to the identity

$$I = \sum_j P_j. \tag{5.11}$$

Any decomposition of the identity of a quantum Hilbert space $\mathcal{H}$ can be thought
of as a quantum sample space of mutually-exclusive properties associated with the 
projectors or with the corresponding subspaces. That the properties are mutually 
exclusive follows from (5.10), see the discussion in Sec. 4.5, which is the quantum 
counterpart of (5.7). The fact that the projectors sum to I is the counterpart of 
(5.9), and expresses the fact that one of these properties must be true. Thus the 
usual requirement that a sample space consist of a collection of mutually-exclusive 
possibilities, one and only one of which is correct, is satisfied by a quantum de- 
composition of the identity. 

The quantum event algebra B corresponding to the sample space (5.11) consists
of all projectors of the form

$$R = \sum_j \pi_j P_j, \tag{5.12}$$

where each $\pi_j$ is either 0 or 1; note the analogy with (5.8). Setting all the $\pi_j$ equal
to 0 yields the zero operator 0 corresponding to the property that is always false;
setting them all equal to 1 yields the identity $I$, which is always true. If there are
$n$ elements in the sample space, there are $2^n$ elements in the event algebra, just as
in ordinary probability theory. The elementary or minimal elements of B are the
projectors $\{P_j\}$ which belong to the sample space, whereas the compound elements
are those for which two or more of the $\pi_j$ in (5.12) are equal to 1.

Since the different projectors which make up the sample space commute with
each other, (5.10), so do all projectors of the form (5.12). And because of (5.10),
the projectors which make up the event algebra B form a Boolean algebra or
Boolean lattice under the operations of $\cap$ and $\cup$ interpreted as $\wedge$ and $\vee$; see (5.5),
which applies equally to classical indicators and quantum projectors. Any
collection of commuting projectors forms a Boolean algebra provided the negation $\tilde{P}$
of any projector $P$ in the collection is also in the collection, and the product $PQ$
($= QP$) of two elements in the collection is also in the collection. (Because of
(4.50), these rules ensure that $P \vee Q$ is also a member of the collection, so this
does not have to be stated as a separate rule.) Note that a Boolean algebra of
projectors is a much simpler object (in algebraic terms) than the noncommutative
algebra of all operators on the Hilbert space.

A trivial decomposition of the identity contains just one projector, $I$; nontrivial
decompositions contain two or more projectors. For a spin-half particle, the only
nontrivial decompositions of the identity are of the form

$$I = [w^+] + [w^-], \tag{5.13}$$

where $w$ is some direction in space, such as the $x$ axis or the $z$ axis. Thus the
sample space consists of two elements, one corresponding to $S_w = +1/2$ and one
to $S_w = -1/2$. These are mutually-exclusive possibilities: one and only one can
be a correct description of this component of spin angular momentum. The event
algebra B consists of the four elements: 0, $I$, $[w^+]$, and $[w^-]$.
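
As a concrete check, one can take $w$ to be the $x$ axis and verify that the two projectors form a decomposition of the identity. The sketch below assumes the standard matrix form $[x^\pm] = (I \pm \sigma_x)/2$ in the $S_z$ basis, which is not written out in the text; exact rational arithmetic is used so that the equalities hold without rounding. The helper names are illustrative.

```python
from fractions import Fraction

half = Fraction(1, 2)
I2 = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Assumed matrices for [x+] and [x-] in the S_z basis: (I +/- sigma_x)/2.
x_plus = [[half, half], [half, half]]
x_minus = [[half, -half], [-half, half]]

is_projector = all(matmul(P, P) == P for P in (x_plus, x_minus))
orthogonal = matmul(x_plus, x_minus) == ZERO   # mutually exclusive
complete = add(x_plus, x_minus) == I2          # one of them must be true

# The four elements of the event algebra generated by this sample space.
event_algebra = [ZERO, x_plus, x_minus, I2]
```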

Next consider a toy model, Sec. 2.5, in which a particle can be located at one of
$M = 3$ sites, $m = -1, 0, 1$. The three kets $|{-1}\rangle$, $|0\rangle$, $|1\rangle$ form an orthonormal basis
of the Hilbert space. A decomposition of the identity appropriate for discussing the
particle’s position contains the three projectors

$$[-1], \quad [0], \quad [1] \tag{5.14}$$

corresponding to the property that the particle is at $m = -1$, $m = 0$, and $m = 1$,
respectively. The Boolean event algebra has $2^3 = 8$ elements: 0, $I$, the three
projectors in (5.14), and three projectors

$$[-1] + [0], \quad [0] + [1], \quad [-1] + [1] \tag{5.15}$$

corresponding to compound events. An alternative decomposition of the identity
for the same Hilbert space consists of the two projectors

$$[-1], \quad [0] + [1], \tag{5.16}$$

which generate an event algebra with only $2^2 = 4$ elements: the projectors in (5.16)
along with 0 and $I$.

Although the same projector [0] + [1] occurs both in (5.15) and in (5.16), its
physical interpretation or meaning in the two cases is actually somewhat different,
and discussing the difference will throw light upon the issue raised at the end of
Sec. 4.5 about the meaning of a quantum disjunction $P \vee Q$. In (5.15), [0] + [1]
represents a compound event whose physical interpretation is that the particle is at
$m = 0$ or at $m = 1$, in much the same way that the compound event {3, 4} in the
case of a die would be interpreted to mean that either $s = 3$ or $s = 4$ spots turned
up. On the other hand, in (5.16) the projector [0] + [1] represents an elementary
event which cannot be thought of as the disjunction of two different possibilities.
In quantum mechanics, each Boolean event algebra constitutes what is in effect a
“language” out of which one can construct a quantum description of some physical
system, and a fundamental rule of quantum theory is that a description (which
may but need not be couched in terms of probabilities) referring to a single system
at a single time must be constructed using a single Boolean algebra, a single
“language”. (This is a particular case of a more general “single-framework rule”
which will be introduced later on, and discussed in some detail in Ch. 16.) The
language based on (5.14) contains among its elementary constituents the projector
[0] and the projector [1], and its grammatical rules allow one to combine such
elements with “and” and “or” in a meaningful way. Hence in this language “[0] or
[1]” makes sense, and it is convenient to represent it using the projector [0] + [1] in
(5.15). On the other hand, the language based on (5.16) contains neither [0] nor [1];
they are not in the sample space, nor are they among the four elements which
constitute its Boolean algebra. Consequently, in this somewhat impoverished
language it is impossible to express the idea “[0] or [1]”, because both [0] and [1] are
meaningless constructs.
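
The two "languages" of the toy model can be enumerated explicitly. The sketch below builds every sum of the form (5.12) for the decompositions (5.14) and (5.16), representing each projector as a diagonal 3x3 matrix in the site basis; the function names `site_projector` and `event_algebra` are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

def site_projector(sites, all_sites=(-1, 0, 1)):
    """Diagonal projector onto the span of the given site kets, as a tuple of rows."""
    return tuple(tuple(int(a == b and a in sites) for b in all_sites) for a in all_sites)

def event_algebra(decomposition):
    """All sums of pi_j * P_j with each pi_j equal to 0 or 1, as in (5.12)."""
    n = len(decomposition[0])
    algebra = set()
    for pis in product((0, 1), repeat=len(decomposition)):
        R = tuple(
            tuple(sum(pi * P[r][c] for pi, P in zip(pis, decomposition)) for c in range(n))
            for r in range(n)
        )
        algebra.add(R)
    return algebra

fine = [site_projector({-1}), site_projector({0}), site_projector({1})]  # (5.14)
coarse = [site_projector({-1}), site_projector({0, 1})]                   # (5.16)

algebra_fine = event_algebra(fine)      # 8 elements
algebra_coarse = event_algebra(coarse)  # 4 elements

# [0] + [1] belongs to both algebras, but [0] alone exists only in the finer one.
shared = site_projector({0, 1}) in algebra_fine and site_projector({0, 1}) in algebra_coarse
zero_site_in_coarse = site_projector({0}) in algebra_coarse
```

The last line is the algebraic counterpart of the statement that "[0]" cannot be expressed in the language based on (5.16).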

The reader may be tempted to dismiss all of this as needless nitpicking better 
suited to mathematicians and philosophers than to physical scientists. Is it not ob- 
vious that one can always replace the impoverished language based upon (5.16) 
with the richer language based upon (5.14), and avoid all this quibbling? The 
answer is that one can, indeed, replace (5.16) with (5.14) in appropriate circum- 
stances; the process of doing so is known as “refinement”, and will be discussed in 
Sec. 5.3. However, in quantum theory there can be many different refinements. In
particular, a second and rather different refinement of (5.16) will be found in (5.19).
Because of the multiple possibilities for refinement, one must pay attention to what 
one is doing, and it is especially important to be explicit about the sample space 
(“language”) that one is using. Shortcuts in reasoning which never cause difficulty 
in classical physics can lead to enormous headaches in quantum theory, and avoid- 
ing these requires that one take into account the rules which govern meaningful 
quantum descriptions. 

As an example of a sample space associated with a continuous quantum system,
consider the decomposition of the identity

$$I = \sum_n [\phi_n] \tag{5.17}$$

corresponding to the energy eigenstates of a quantum harmonic oscillator, in the
notation of Sec. 4.3. The elementary event $[\phi_n]$ can be interpreted as the energy
having the value $n + 1/2$ in units of $\hbar\omega$. These events are mutually-exclusive
possibilities: if the energy is 3.5, it cannot be 0.5 or 2.5, etc. The projector
$[\phi_2] + [\phi_3]$ in the Boolean algebra generated by (5.17) means that the energy is equal to
2.5 or 3.5. If, on the other hand, one were to replace (5.17) with an alternative
decomposition of the identity consisting of the projectors
$\{([\phi_{2m}] + [\phi_{2m+1}]),\ m = 0, 1, 2, \ldots\}$, each projecting onto a two-dimensional
subspace of $\mathcal{H}$, $[\phi_2] + [\phi_3]$
could not be interpreted as an energy equal to 2.5 or 3.5, since states without a
well-defined energy are also present in the corresponding subspace. See the preceding
discussion of the toy model.


5.3 Refinement, coarsening, and compatibility 

Suppose there are two decompositions of the identity, $\mathcal{E} = \{E_j\}$ and $\mathcal{F} = \{F_k\}$,
with the property that each $F_k$ can be written as a sum of one or more of the $E_j$. In
such a case we will say that the decomposition $\mathcal{E}$ is a refinement of $\mathcal{F}$, or $\mathcal{E}$ is finer
than $\mathcal{F}$, or $\mathcal{E}$ is obtained by refining $\mathcal{F}$. Equivalently, $\mathcal{F}$ is a coarsening of $\mathcal{E}$, is
coarser than $\mathcal{E}$, and is obtained by coarsening $\mathcal{E}$. For example, the decomposition
(5.14) is a refinement of (5.16) obtained by replacing the single projector [0] + [1]
in the latter with the two projectors [0] and [1].

According to this definition, any decomposition of the identity is its own refine- 
ment (or coarsening), and it is convenient to allow the possibility of such a trivial 
refinement (or coarsening). If the two decompositions are actually different, one is 
a nontrivial or proper refinement/coarsening of the other. An ultimate decomposi- 
tion of the identity is one in which each projector projects onto a one-dimensional 
subspace, so no further refinement (of a nontrivial sort) is possible. Thus (5.13),
(5.14), and (5.17) are ultimate decompositions, whereas (5.16) is not.

Two or more decompositions of the identity are said to be (mutually) compatible
provided they have a common refinement, that is, provided there is a single
decomposition $\mathcal{R}$ which is finer than each of the decompositions under consideration.
When no common refinement exists the decompositions are said to be (mutually)
incompatible. If $\mathcal{E}$ is a refinement of $\mathcal{F}$, the two are obviously compatible, because
$\mathcal{E}$ is itself the common refinement.

The toy model with $M = 3$ considered in Sec. 5.2 provides various examples of
compatible and incompatible decompositions of the identity. The decomposition

$$([-1] + [0]), \quad [1] \tag{5.18}$$

is compatible with (5.16) because (5.14) is a common refinement. The decomposition

$$[-1], \quad [p], \quad [q], \tag{5.19}$$

where the projectors $[p]$ and $[q]$ correspond to the kets

$$|p\rangle = (|0\rangle + |1\rangle)/\sqrt{2}, \quad |q\rangle = (|0\rangle - |1\rangle)/\sqrt{2}, \tag{5.20}$$

is a refinement of (5.16), as is (5.14), so both (5.14) and (5.19) are compatible
with (5.16). However, (5.14) and (5.19) are incompatible with each other: since
each is an ultimate decomposition, and they are not identical, there is no common
refinement. In addition, (5.19) is incompatible with (5.18), though this is not quite
so obvious. As another example, the two decompositions

$$I = [x^+] + [x^-], \quad I = [z^+] + [z^-] \tag{5.21}$$

for a spin-half particle are incompatible, because each is an ultimate decomposition,
and they are not identical.
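
The refinement relations among (5.14), (5.16), and (5.19) can be tested mechanically. In the sketch below, the hypothetical helper `is_refinement` checks that each projector of the coarser decomposition equals the sum of those projectors of the finer one that lie inside it. The matrices for $[p]$ and $[q]$ follow from (5.20) as $|p\rangle\langle p|$ and $|q\rangle\langle q|$ in the site basis, and happen to have rational entries, so exact arithmetic can be used.

```python
from fractions import Fraction

h = Fraction(1, 2)

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def is_refinement(fine, coarse):
    """True if every projector in `coarse` is the sum of the members of `fine` inside it."""
    zero = [[0] * 3 for _ in range(3)]
    for F in coarse:
        total = zero
        for E in fine:
            # E lies inside the subspace of F exactly when EF = FE = E.
            if matmul(E, F) == E and matmul(F, E) == E:
                total = add(total, E)
        if total != F:
            return False
    return True

# Basis order: sites m = -1, 0, 1.
P_m1 = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
P_0 = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
P_1 = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]
P_p = [[0, 0, 0], [0, h, h], [0, h, h]]     # [p] from (5.20)
P_q = [[0, 0, 0], [0, h, -h], [0, -h, h]]   # [q] from (5.20)

d514 = [P_m1, P_0, P_1]
d516 = [P_m1, add(P_0, P_1)]
d519 = [P_m1, P_p, P_q]
```

With these definitions, both `is_refinement(d514, d516)` and `is_refinement(d519, d516)` hold, while neither of the two ultimate decompositions refines the other.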

If $\mathcal{E}$ and $\mathcal{F}$ are compatible, then each projector $E_j$ can be written as a combination
of projectors from the common refinement $\mathcal{R}$, and the same is true of each $F_k$.
That is to say, the projectors $\{E_j\}$ and $\{F_k\}$ belong to the Boolean event algebra
generated by $\mathcal{R}$. As all the operators in this algebra commute with each other, it
follows that every projector $E_j$ commutes with every projector $F_k$. Conversely, if
every $E_j$ in $\mathcal{E}$ commutes with every $F_k$ in $\mathcal{F}$, there is a common refinement: all
nonzero projectors of the form $\{E_j F_k\}$ constitute the decomposition generated by
$\mathcal{E}$ and $\mathcal{F}$, and it is the coarsest common refinement of $\mathcal{E}$ and $\mathcal{F}$. The same argument
can be extended to a larger collection of decompositions, and leads to the general 
rule that decompositions of the identity are mutually compatible if and only if all 
the projectors belonging to all of the decompositions commute with each other. If 
any pair of projectors fail to commute, the decompositions are incompatible. Using 
this rule it is immediately evident that the decompositions in (5.16) and (5.18) are 
compatible, whereas those in (5.18) and (5.19) are incompatible. The two decom- 
positions in (5.21) are incompatible, as are any two decompositions of the identity 
of the form (5.13) if they correspond to two directions in space that are neither the 
same nor opposite to each other. Since it arises from projectors failing to commute 
with each other, incompatibility is a feature of the quantum world with no close 
analog in classical physics. Different sample spaces associated with a single
classical system are always compatible: they always possess a common refinement.
For example, a common refinement of two coarse grainings of a classical phase 
space is easily constructed using the nonempty intersections of cells taken from 
the two sample spaces. 
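
The commutation rule lends itself to a direct computational test. The sketch below checks pairwise commutation for the toy-model decompositions (5.16), (5.18), and (5.19), and builds the coarsest common refinement from the nonzero products $E_j F_k$ when it exists. The helper names and the `diag` shortcut are illustrative assumptions.

```python
from fractions import Fraction

h = Fraction(1, 2)
ZERO = [[0] * 3 for _ in range(3)]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def diag(a, b, c):
    """Diagonal matrix in the site basis m = -1, 0, 1."""
    return [[a, 0, 0], [0, b, 0], [0, 0, c]]

def compatible(dec_a, dec_b):
    """Compatible exactly when every projector of one commutes with every one of the other."""
    return all(matmul(E, F) == matmul(F, E) for E in dec_a for F in dec_b)

def common_refinement(dec_a, dec_b):
    """Nonzero products E_j F_k; meaningful only for compatible decompositions."""
    products = [matmul(E, F) for E in dec_a for F in dec_b]
    return [P for P in products if P != ZERO]

P_p = [[0, 0, 0], [0, h, h], [0, h, h]]     # [p] from (5.20)
P_q = [[0, 0, 0], [0, h, -h], [0, -h, h]]   # [q] from (5.20)

d516 = [diag(1, 0, 0), diag(0, 1, 1)]
d518 = [diag(1, 1, 0), diag(0, 0, 1)]
d519 = [diag(1, 0, 0), P_p, P_q]

refinement = common_refinement(d516, d518)  # reproduces (5.14)
```

Here (5.16) and (5.18) commute and their products give back the three site projectors, whereas $[-1] + [0]$ fails to commute with $[p]$, which is why (5.18) and (5.19) are incompatible.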

As noted in Sec. 5.2, a fundamental rule of quantum theory is that a descrip- 
tion of a particular quantum system must be based upon a single sample space or 
decomposition of the identity. If one wants to use two or more compatible sam- 
ple spaces, this rule can be satisfied by employing a common refinement, since its 
Boolean algebra will include the projectors associated with the individual spaces. 
On the other hand, trying to combine descriptions based upon two (or more)
incompatible sample spaces can lead to serious mistakes. Consider, for example, the two
incompatible decompositions in (5.21). Using the first, one can conclude that for a
spin-half particle, either $S_x = +1/2$ or $S_x = -1/2$. Similarly, by using the second
one can conclude that either $S_z = +1/2$ or else $S_z = -1/2$. However, combining
these in a manner which would be perfectly correct for a classical spinning object
leads to the conclusion that one of the four possibilities

$$\begin{aligned} S_x = +1/2 \wedge S_z = +1/2, &\quad S_x = +1/2 \wedge S_z = -1/2, \\ S_x = -1/2 \wedge S_z = +1/2, &\quad S_x = -1/2 \wedge S_z = -1/2 \end{aligned} \tag{5.22}$$

must be a correct description of the particle. But in fact all four possibilities are
meaningless, as discussed previously in Sec. 4.6, because none of them corresponds
to a subspace of the quantum Hilbert space.
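
One way to see the problem concretely is to try to form the product of $[x^+]$ and $[z^+]$, which for commuting projectors would represent the conjunction. The sketch below assumes the standard matrices for these projectors in the $S_z$ basis (not written out in the text) and shows that the two orderings disagree and that the product is not itself a projector.

```python
from fractions import Fraction

h = Fraction(1, 2)

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

x_plus = [[h, h], [h, h]]   # [x+] in the S_z basis (assumed standard form)
z_plus = [[1, 0], [0, 0]]   # [z+]

xz = matmul(x_plus, z_plus)
zx = matmul(z_plus, x_plus)

commute = xz == zx                        # False: the order matters
xz_is_projector = matmul(xz, xz) == xz    # False: the product is not idempotent
```

Since neither ordering yields a projector, there is no subspace that could stand for "$S_x = +1/2$ AND $S_z = +1/2$".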


5.4 Probabilities and ensembles 

Given a sample space, a probability distribution assigns a nonnegative number or
probability $p_s$, also written $\Pr(s)$, to each point $s$ of the sample space in such a
way that these numbers sum to 1. For example, in the case of a six-sided die, one
often assigns equal probabilities to each of the six possibilities for the number of
spots $s$; thus $p_s = 1/6$. However, this assignment is not a fundamental law of
probability theory, and there exist dice for which a different set of probabilities
would be more appropriate. Each compound event $E$ in the event algebra is
assigned a probability $\Pr(E)$ equal to the sum of the probabilities of the elements
of the sample space which it contains. Thus “$s$ is even” in the case of a die is
assigned a probability $p_2 + p_4 + p_6$, which is 1/2 if each $p_s$ is 1/6. The assignment
of probabilities in the case of continuous variables, e.g., a classical phase
space, can be quite a bit more complicated. However, the simpler discrete case
will be quite adequate for this book; we will not need sophisticated concepts from
measure theory.
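
The rule for compound events is simple enough to state as a short function. The sketch below uses exact fractions; the loaded-die probabilities are hypothetical numbers chosen only to illustrate that the uniform assignment is not the only possibility.

```python
from fractions import Fraction

fair = {s: Fraction(1, 6) for s in range(1, 7)}

# A hypothetical loaded die that favors six spots; the numbers still sum to 1.
loaded = {s: Fraction(1, 10) for s in range(1, 6)}
loaded[6] = Fraction(1, 2)

def pr(event, p):
    """Probability of a compound event: the sum of p_s over the points it contains."""
    return sum(p[s] for s in event)

even = {2, 4, 6}
pr_even_fair = pr(even, fair)      # p_2 + p_4 + p_6 with each p_s = 1/6
pr_even_loaded = pr(even, loaded)
```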

Along with a formal definition, one needs an intuitive idea of the meaning of
probabilities. One approach is to imagine an ensemble: a collection of $N$ nominally
identical systems, where $N$ is a very large number, with each system in one
of the states which make up the sample space S, and with the fraction of members
of the ensemble in state $s$ given by the corresponding probability $p_s$. For
example, the ensemble could be a large number of dice, each displaying a certain
number of spots, with 1/6 of the members of the ensemble displaying one spot,
1/6 displaying two spots, etc. One should always think of $N$ as such a large number
that $p_s N$ is also very large for any $p_s$ that is greater than 0, to get around
any concerns about whether the fraction of systems in state $s$ is precisely equal
to $p_s$. One says that the probability that a single system chosen at random from
such an ensemble is in state $s$ is given by $p_s$. Of course, any particular system
will be in some definite state, but this state is not known before the system is
selected from the ensemble. Thus the probability represents “partial information”
about a system when its actual state is not known. For example, if the probability
of some state is close to 1, one can be fairly confident, but not absolutely certain,
that a system chosen at random will be in this state and not in some other
state.

Rather than imagining the ensemble to be a large collection of systems, it is 
sometimes useful to think of it as made up of the outcomes of a large number of 
experiments carried out at successive times, with care being taken to ensure that 
these are independent in the sense that the outcome of any one experiment is not 
allowed to influence the outcome of later experiments. Thus instead of a large 
collection of dice, one can think of a single die which is rolled a large number of 
times. The fraction of experiments leading to the result $s$ is then the probability $p_s$.
The outcome of any particular experiment in the sequence is not known in advance, 
but a knowledge of the probabilities provides partial information. 
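
The picture of an ensemble built from repeated independent rolls can be imitated with a pseudorandom number generator. This is only a rough illustration: the seed and the number of rolls are arbitrary choices, and for any finite number of rolls the observed fractions approximate the probabilities rather than equal them.

```python
import random
from collections import Counter

random.seed(0)   # fixed seed so that the run is reproducible
N = 60_000       # number of independent rolls, assumed "large enough"

rolls = [random.randint(1, 6) for _ in range(N)]
counts = Counter(rolls)

# Fraction of rolls giving each outcome s; each should be close to 1/6.
fractions = {s: counts[s] / N for s in range(1, 7)}
largest_deviation = max(abs(f - 1 / 6) for f in fractions.values())
```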

Probability theory as a mathematical discipline does not tell one how to choose 
a probability distribution. Probabilities are sometimes obtained directly from ex- 
perimental data. In other cases, such as the Boltzmann distribution for systems in 
thermal equilibrium, the probabilities express well-established physical laws. In 
some cases they simply represent a guess. Later we shall see how to use the dy- 
namical laws of quantum theory to calculate various quantum probabilities. The 
true meaning of probabilities is a subject about which there continue to be disputes, 
especially among philosophers. These arguments need not concern us, for prob- 
abilities in quantum theory, when properly employed with a well-defined sample 
space, obey the same rules as in classical physics. Thus the situation in quantum 
physics is no worse (or better) than in the everyday classical world. 

Conditional probabilities play a fundamental role in probabilistic reasoning and
in the application of probability theory to quantum mechanics. Let $A$ and $B$ be two
events, and suppose that $\Pr(B) > 0$. The conditional probability of $A$ given $B$ is
defined to be

$$\Pr(A \mid B) = \Pr(A \wedge B)/\Pr(B), \tag{5.23}$$

where $A \wedge B$ is the event “A AND B” represented by the product $AB$ of the classical
indicators, or of the quantum projectors. Hence one can also write $\Pr(AB)$ in place
of $\Pr(A \wedge B)$. The intuitive idea of a conditional probability can be expressed in
the following way. Given an ensemble, consider only those members in which $B$
occurs (is true). These comprise a subensemble of the original ensemble, and in
this subensemble the fraction of systems with property $A$ is given by $\Pr(A \mid B)$
rather than by $\Pr(A)$, as in the original ensemble. For example, in the case of a
die, let $B$ be the property that $s$ is even, and $A$ the property $s \le 3$. Assuming
equal probabilities for all outcomes, $\Pr(A) = 1/2$. However, $\Pr(A \mid B) = 1/3$,
corresponding to the fact that of the three possibilities $s = 2, 4, 6$ which constitute
the compound event $B$, only one is less than or equal to 3.

If $B$ is held fixed, $\Pr(A \mid B)$ as a function of its first argument $A$ behaves like
an “ordinary” probability distribution. For example, if we use $s$ to indicate points
in the sample space, the numbers $\Pr(s \mid B)$ are nonnegative, and $\sum_s \Pr(s \mid B) = 1$.
One can think of $\Pr(A \mid B)$ with $B$ fixed as obtained by setting to zero the
probabilities of all elements of the sample space for which $B$ is false (does not
occur), and multiplying the probabilities of those elements for which $B$ is true
by a common factor, $1/\Pr(B)$, to renormalize them, so that the probabilities of
mutually-exclusive sets of events sum to one. That this is a reasonable procedure
is evident if one imagines an ensemble and thinks about the subensemble of cases
in which $B$ occurs. It makes no sense to define a probability conditioned on $B$ if
$\Pr(B) = 0$, as there is no way to renormalize zero probability by multiplying it by
a constant in order to get something finite.
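
The die example and the renormalization picture can both be reproduced in a few lines. The sketch below assumes a fair die and uses exact fractions; the function names are illustrative. It raises an error when $\Pr(B) = 0$, reflecting the fact that the conditional probability is then undefined.

```python
from fractions import Fraction

p = {s: Fraction(1, 6) for s in range(1, 7)}   # fair die (assumed)

def pr(event):
    """Probability of a compound event given as a set of outcomes."""
    return sum(p[s] for s in event)

def conditional(A, B):
    """Pr(A | B) = Pr(A AND B) / Pr(B); undefined when Pr(B) = 0."""
    pb = pr(B)
    if pb == 0:
        raise ValueError("Pr(B) = 0: conditional probability is undefined")
    return pr(A & B) / pb

A = {1, 2, 3}   # s <= 3
B = {2, 4, 6}   # s is even

# Conditioning on B zeroes the odd outcomes and rescales the even ones by 1/Pr(B).
cond_dist = {s: conditional({s}, B) for s in p}
```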

In the case of quantum systems, once an appropriate sample space has been de- 
fined the rules for manipulating probabilities are precisely the same as for any other 
(“classical”) probabilities. The probabilities must be nonnegative, they must sum 
to 1, and conditional probabilities are defined in precisely the manner discussed 
above. Sometimes it seems as if quantum probabilities obey different rules from 
what one is accustomed to in classical physics. The reason is that quantum the- 
ory allows a multiplicity of sample spaces, that is, decompositions of the identity, 
which are often incompatible with one another. In classical physics a single sam- 
ple space is usually sufficient, and in cases in which one may want to use more 
than one, for example alternative coarse grainings of the phase space, the different 
possibilities are always compatible with each other. However, in quantum theory 
different sample spaces are generally incompatible with one another, so one has 
to learn how to choose the correct sample space needed for discussing a particular 
physical problem, and how to avoid carelessly combining results from incompati- 
ble sample spaces. Thus the difficulties one encounters in quantum mechanics have 
to do with choosing a sample space. Once the sample space has been specified, the 
quantum rules are the same as the classical rules. 

There have been, and no doubt will continue to be, a number of proposals for 
introducing special “quantum probabilities” with properties which violate the usual 
rules of probability theory: probabilities which are negative numbers, or complex 
numbers, or which are not tied to a Boolean algebra of projectors, etc. Thus far, 
none of these proposals has proven helpful in untangling the conceptual difficulties 
of quantum theory. Perhaps someday the situation will change, but until then there 
seems to be no reason to abandon standard probability theory, a mode of reasoning 
which is quite well understood, both formally and intuitively, and replace it with 
some scheme which is deficient in one or both of these respects. 


5.5 Random variables and physical variables

In ordinary probability theory a random variable is a real-valued function $V$
defined everywhere on the sample space. For example, if $s$ is the number of spots
when a die is rolled, $V(s) = s$ is an example of a random variable, as is
$V(s) = s^2/6$. For coin tossing, $V(H) = +1/2$, $V(T) = -1/2$ is an example of a random
variable.

If one regards the x, p phase plane for a particle in one dimension as a sample 
space, then any real-valued function $V(x, p)$ is a random variable. Examples of
physical interest include the position, the momentum, the kinetic energy, the po- 
tential energy, and the total energy. For a particle in three dimensions the various 
components of angular momentum relative to some origin are also examples of 
random variables. 

In classical mechanics the term physical variable is probably more descriptive 
than “random variable” when referring to a function defined on the phase space, 
and we shall use it for both classical and quantum systems. However, thinking of 
physical variables as random variables, that is, as functions defined on a sample 
space, is particularly helpful in understanding what they mean in quantum theory. 

The quantum counterpart of the function V representing a physical variable in 
classical mechanics is a Hermitian or self-adjoint operator $V = V^\dagger$ on the Hilbert
space. Thus position, energy, angular momentum, and the like all correspond to 
specific quantum operators. Generalizing from this, we shall think of any self- 
adjoint operator on the Hilbert space as representing some (not necessarily very 
interesting) physical variable. A quantum physical variable is often called an ob- 
servable. While this term is not ideal, given its association with somewhat con- 
fused and contradictory ideas about quantum measurements, it is widely used in 
the literature, and in this book we shall employ it to refer to any quantum physical 
variable, that is, to any self-adjoint operator on the quantum Hilbert space, without 
reference to whether it could, in practice or in principle, be measured. 

To see how self-adjoint operators can be thought of as random variables in the
sense of probability theory, one can make use of a fact discussed in Sec. 3.7: if
$V = V^\dagger$, then there is a unique decomposition of the identity $\{P_j\}$, determined by
the operator $V$, such that, see (3.75),

$$V = \sum_j v'_j P_j, \tag{5.24}$$

where the $v'_j$ are eigenvalues of $V$, and $v'_j \ne v'_k$ for $j \ne k$. Since any decomposition
of the identity can be regarded as a quantum sample space, one can think of the
collection $\{P_j\}$ as the “natural” sample space for the physical variable or operator
$V$. On this sample space the operator $V$ behaves very much like a real-valued
function: to $P_1$ it assigns the value $v'_1$, to $P_2$ the value $v'_2$, and so forth. That
(5.24) can be interpreted in this way is suggested by the fact that for a discrete
sample space S, an ordinary random variable $V$ can always be written as a sum of
numbers times the elementary indicators defined in (5.6),

$$V(s) = \sum_r v_r P_r(s), \tag{5.25}$$

where $v_r = V(r)$. Since quantum projectors are analogous to classical indicators,
and the indicators on the right side of (5.25) are associated with the different
elements of the sample space, there is an obvious and close analogy between (5.24)
and (5.25).

The only possible values for a quantum observable $V$ are the eigenvalues $v'_j$ in
(5.24) or, equivalently, the $v_j$ in (5.32), just as the only possible values of a classical
random variable are the $v_r$ in (5.25). In order for a quantum system to possess the
value $v$ for the observable $V$, the property “$V = v$” must be true, and this means
that the system is in an eigenstate of $V$. That is to say, the quantum system is
described by a nonzero ket $|\psi\rangle$ such that

$$V|\psi\rangle = v|\psi\rangle, \tag{5.26}$$

or, more generally, by a nonzero projector $Q$ such that

$$VQ = vQ. \tag{5.27}$$

In order for (5.27) to hold for a projector $Q$ onto a space of dimension 2 or more,
the eigenvalue $v$ must be degenerate, and if $v = v'_j$, then

$$P_j Q = Q, \tag{5.28}$$

where $P_j$ is the projector in (5.24) corresponding to $v'_j$.

Let us consider some examples, beginning with a one-dimensional harmonic
oscillator. Its (total) energy corresponds to the Hamiltonian operator $H$, which can
be written in the form

$$H = \sum_n (n + 1/2)\hbar\omega [\phi_n], \tag{5.29}$$

where the corresponding decomposition of the identity was introduced in (5.17).
The Hamiltonian can thus be thought of as a function which assigns to the projector
$[\phi_n]$, or to the subspace of multiples of $|\phi_n\rangle$, the energy $(n + 1/2)\hbar\omega$. In the case
of a spin-half particle the operator for the $z$ component of spin angular momentum
divided by $\hbar$ is

$$S_z = +\tfrac{1}{2}[z^+] - \tfrac{1}{2}[z^-]. \tag{5.30}$$

It assigns to $[z^+]$ the value $+1/2$, and to $[z^-]$ the value $-1/2$. Next think of a toy
model in which the sites are labeled by an integer $m$, and suppose that the distance
between adjacent sites is the length $b$. Then the position operator will be given by

$$B = \sum_m m b [m]. \tag{5.31}$$

The position operator $x$ for a “real” quantum particle in one dimension is a
complicated object, and writing it in a form equivalent to (5.24) requires replacing the
sum with an integral, using mathematics which is outside the scope of this book.
In all the examples considered thus far, the $P_j$ are projectors onto one-dimensional
subspaces, so they can be written as dyads, and (5.24) is equivalent to writing

$$V = \sum_j v_j |\nu_j\rangle\langle\nu_j| = \sum_j v_j [\nu_j], \tag{5.32}$$

where the eigenvalues in (5.32) are identical to those in (5.24), except that the sub- 
script labels may be different. As discussed in Sec. 3.7, (5.24) and (5.32) will be 
different if one or more of the eigenvalues of V are degenerate, that is, if a partic- 
ular eigenvalue occurs more than once on the right side of (5.32). For instance, the 
energy eigenvalues of atoms are often degenerate due to spherical symmetry, and 
in this case the projector $P_j$ for the $j$th energy level projects onto a space whose
dimension is equal to the multiplicity (or degeneracy) of the level. When such 
degeneracies occur, it is possible to construct nontrivial refinements of the
decomposition $\{P_j\}$ in the sense discussed in Sec. 5.3, by writing one or more of the $P_j$
as a sum of two or more nonzero projectors. If $\{Q_k\}$ is such a refinement, it is
obviously possible to write

$$V = \sum_k v''_k Q_k, \tag{5.33}$$

where the extra prime allows the eigenvalues in (5.33) to carry different subscripts
from those in (5.24). One can again think of $V$ as a random variable, that is, a
function, on the finer sample space $\{Q_k\}$. Note that when it is possible to refine
a quantum sample space in this manner, it is always possible to refine it in many 
different ways which are mutually incompatible. Whereas any one of these sample 
spaces is perfectly acceptable so far as the physical variable V is concerned, one 
will make mistakes if one tries to combine two or more incompatible sample spaces 
in order to describe a single physical system; see the comments in Sec. 5.3. 
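
The toy model of Sec. 5.2 gives a small instance of this situation. Suppose, purely for illustration, that an operator takes the value 1 on $[-1]$ and the doubly degenerate value 2 on $[0] + [1]$; these eigenvalues are hypothetical. The sketch below writes the same operator on two different refinements, $\{[-1], [0], [1]\}$ and $\{[-1], [p], [q]\}$, and confirms that the refinements do not commute with each other.

```python
from fractions import Fraction

h = Fraction(1, 2)

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def weighted_sum(pairs):
    """Return the sum of v_k * Q_k over a list of (eigenvalue, projector) pairs."""
    n = len(pairs[0][1])
    return [[sum(v * Q[r][c] for v, Q in pairs) for c in range(n)] for r in range(n)]

# Basis order: sites m = -1, 0, 1.
P_m1 = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
P_0 = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
P_1 = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]
P_p = [[0, 0, 0], [0, h, h], [0, h, h]]     # [p] from (5.20)
P_q = [[0, 0, 0], [0, h, -h], [0, -h, h]]   # [q] from (5.20)

# The same operator expressed on two different refinements of {[-1], [0] + [1]}.
V_sites = weighted_sum([(1, P_m1), (2, P_0), (2, P_1)])
V_pq = weighted_sum([(1, P_m1), (2, P_p), (2, P_q)])

same_operator = V_sites == V_pq
refinements_commute = matmul(P_0, P_p) == matmul(P_p, P_0)
```

Either refinement is an acceptable sample space for this operator, but because $[0]$ and $[p]$ fail to commute the two cannot be combined in a single description.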

On the other hand, V cannot be defined as a physical (“random”) variable on a 
decomposition which is coarser than {Pj}, since one cannot assign two different 
eigenvalues to the same projector or subspace. (To be sure, one might define a 
“coarse” version of V, but that would be a different physical variable.) Nor can V 
be defined as a physical or random variable on a decomposition which is incom- 
patible with {Pj}, in the sense discussed in Sec. 5.3. It may, of course, be possible 
to approximate V with an operator which is a function on an alternative
decomposition, but such approximations are outside the scope of the present discussion.


5.6 Averages 

The average $\langle V \rangle$ of a random variable $V(s)$ on a sample space $S$ is defined by the
formula

$$ \langle V \rangle = \sum_{s \in S} p_s V(s). \tag{5.34} $$

That is, the probabilities are used to weight the values of V at the different sample 
points before adding them together. One way to justify the weights in (5.34) is 
to imagine an ensemble consisting of a very large number N of systems. If V is 
evaluated for each system, and the results are then added together and divided by 
N, the outcome will be (5.34), because the fraction of systems in the ensemble in
state $s$ is equal to $p_s$.

Random variables form a real linear space in the sense that if $U(s)$ and $V(s)$ are
two random variables, so is the linear combination

$$ u U(s) + v V(s), \tag{5.35} $$

where $u$ and $v$ are real numbers. The average operation $\langle\ \rangle$ defined in (5.34) is a
linear functional on this space, since

$$ \langle u U(s) + v V(s) \rangle = u \langle U \rangle + v \langle V \rangle. \tag{5.36} $$

Another property of $\langle\ \rangle$ is that when it is applied to a positive random variable
$W(s) \geq 0$, the result cannot be negative:

$$ \langle W \rangle \geq 0. \tag{5.37} $$

In addition, the average of the identity is 1,

$$ \langle I \rangle = 1, \tag{5.38} $$

because the probabilities $\{p_s\}$ sum to 1.

The linear functional $\langle\ \rangle$ is obviously determined once the probabilities $\{p_s\}$ are
given. Conversely, a functional $\langle\ \rangle$ defined on the linear space of random variables
determines a unique probability distribution, since one can use averages of the
elementary indicators in (5.6),

$$ p_s = \langle P_s \rangle, \tag{5.39} $$

in order to define positive probabilities which sum to 1 in view of (5.9) and (5.38).
In a similar way, the probability of a compound event $A$ is equal to the average of
its indicator:

$$ \Pr(A) = \langle A \rangle. \tag{5.40} $$

Averages for quantum mechanical physical (random) variables follow precisely 
the same rules; the only differences are in notation. One starts with a sample space
$\{P_j\}$ of projectors which sum to $I$, and a set of nonnegative probabilities $\{p_j\}$
which sum to 1. A random variable on this space is a Hermitian operator which
can be written in the form

$$ V = \sum_j v_j P_j, \tag{5.41} $$

where the different eigenvalues appearing in the sum need not be distinct. That is,
the sample space could be either the “natural” space associated with the operator
$V$ as discussed in Sec. 5.5, or some refinement. The average

$$ \langle V \rangle = \sum_j p_j v_j \tag{5.42} $$

is formally equivalent to (5.34).

A probability distribution on a given sample space can only be used to calculate 
averages of random variables defined on this sample space; it cannot be used, at 
least directly, to calculate averages of random variables which are defined on some 
other sample space. While this is rather obvious in ordinary probability theory, its 
quantum counterpart is sometimes overlooked. In particular, the probability distribution
associated with $\{P_j\}$ cannot be used to calculate the average of a self-adjoint
operator $S$ whose natural sample space is a decomposition $\{Q_k\}$ incompatible with
$\{P_j\}$. Instead one must use a probability distribution for the decomposition $\{Q_k\}$.

An alternative way of writing (5.42) is the following. The positive operator

$$ \rho = \sum_j p_j P_j / \mathrm{Tr}(P_j) \tag{5.43} $$

has a trace equal to 1, so it is a density matrix, as defined in Sec. 3.9. It is easy to
show that

$$ \langle V \rangle = \mathrm{Tr}(\rho V) \tag{5.44} $$

by applying the orthogonality conditions (5.10) to the product $\rho V$. Note that $\rho$ and
$V$ commute with each other. The formula (5.44) is sometimes used in situations in
which $\rho$ and $V$ do not commute with each other. In such a case $\rho$ is functioning as
a pre-probability, as will be explained in Ch. 15.
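
The agreement between (5.42) and (5.44) can be checked numerically. The following is a minimal sketch, assuming a hypothetical three-dimensional Hilbert space with a two-projector decomposition; the probabilities and eigenvalues are made up for illustration and do not come from the text.

```python
import numpy as np

# Hypothetical decomposition of the identity on a 3-dimensional space:
# P[0] projects onto a 2-dimensional subspace, P[1] onto a 1-dimensional one.
P = [np.diag([1.0, 1.0, 0.0]), np.diag([0.0, 0.0, 1.0])]
p = np.array([0.25, 0.75])   # assumed probabilities p_j, summing to 1
v = np.array([2.0, -1.0])    # assumed eigenvalues v_j

V = sum(vj * Pj for vj, Pj in zip(v, P))                     # (5.41)
avg_direct = float(np.dot(p, v))                             # (5.42)
rho = sum(pj * Pj / np.trace(Pj) for pj, Pj in zip(p, P))    # (5.43)
avg_trace = float(np.trace(rho @ V))                         # (5.44)

print(avg_direct, avg_trace)
```

Dividing each projector by its trace in (5.43) is what compensates for the rank-2 projector contributing two diagonal entries to the trace.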


6.1 Introduction

A composite system is one involving more than one particle, or a particle with in- 
ternal degrees of freedom in addition to its center of mass. In classical mechanics 
the phase space of a composite system is a Cartesian product of the phase spaces 
of its constituents. The Cartesian product of two sets A and B is the set of (or- 
dered) pairs {(a, b )}, where a is any element of A and b is any element of B. For 
three sets A, B, and C the Cartesian product consists of triples {(a, b, c)}, and so 
forth. Consider two classical particles in one dimension, with phase spaces $x_1, p_1$
and $x_2, p_2$. The phase space for the composite system consists of pairs of points
from the two phase spaces, that is, it is a collection of quadruples of the form
$x_1, p_1, x_2, p_2$, which can equally well be written in the order $x_1, x_2, p_1, p_2$. This is
formally the same as the phase space of a single particle in two dimensions, a collection
of quadruples $x, y, p_x, p_y$. Similarly, the six-dimensional phase space of a
particle in three dimensions is formally the same as that of three one-dimensional 
particles. 

In quantum theory the analog of a Cartesian product of classical phase spaces 
is a tensor product of Hilbert spaces. A particle in three dimensions has a Hilbert 
space which is the tensor product of three spaces, each corresponding to motion 
in one dimension. The Hilbert space for two particles, as long as they are not 
identical, is the tensor product of the two Hilbert spaces for the separate particles. 
The Hilbert space for a particle with spin is the tensor product of the Hilbert space 
of wave functions for the center of mass, appropriate for a particle without spin, 
with the spin space, which is two-dimensional for a spin-half particle. 

Not only are tensor products used in quantum theory for describing a composite 
system at a single time, they are also very useful for describing the time development
of a quantum system, as we shall see in Ch. 8. Hence any serious student
of quantum mechanics needs to become familiar with the basic facts about tensor
products, and the corresponding notation, which is summarized in Sec. 6.2.

Special rules apply to the tensor product spaces used for identical quantum par- 
ticles. For identical bosons one uses the symmetrical subspace of the Hilbert space 
formed by taking a tensor product of the spaces for the individual particles, while 
for identical fermions one uses the antisymmetrical subspace. The basic procedure 
for constructing these subspaces is discussed in various introductory and more ad- 
vanced textbooks (see references in the Bibliography), but the idea behind it is 
probably easiest to understand in the context of quantum field theory, which lies 
outside the scope of this book. While we shall not discuss the subject further, it 
is worth pointing out that there are a number of circumstances in which the fact 
that the particles are identical can be ignored — that is, one makes no significant 
error by treating them as distinguishable — because they are found in different lo- 
cations or in different environments. For example, identical nuclei in a solid can be 
regarded as distinguishable as long as it is a reasonable physical approximation to 
assume that they are approximately localized, e.g., found in a particular unit cell, 
or in a particular part of a unit cell. In such cases one can construct the tensor 
product spaces in a straightforward manner using the principles described below. 


6.2 Definition of tensor products 

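Given two Hilbert spaces $\mathcal{A}$ and $\mathcal{B}$, their tensor product $\mathcal{A} \otimes \mathcal{B}$ can be defined
in the following way, where we assume, for simplicity, that the spaces are finite-dimensional.
Let $\{|a_j\rangle : j = 1, 2, \ldots, m\}$ be an orthonormal basis for the $m$-dimensional
space $\mathcal{A}$, and $\{|b_p\rangle : p = 1, 2, \ldots, n\}$ an orthonormal basis for the
$n$-dimensional space $\mathcal{B}$, so that

$$ \langle a_j | a_k \rangle = \delta_{jk}, \quad \langle b_p | b_q \rangle = \delta_{pq}. \tag{6.1} $$

Then the collection of $mn$ elements

$$ |a_j\rangle \otimes |b_p\rangle \tag{6.2} $$

forms an orthonormal basis of the tensor product $\mathcal{A} \otimes \mathcal{B}$, which is the set of all
linear combinations of the form

$$ |\psi\rangle = \sum_j \sum_p \gamma_{jp} \bigl( |a_j\rangle \otimes |b_p\rangle \bigr), \tag{6.3} $$

where the $\gamma_{jp}$ are complex numbers.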

Given kets

$$ |a\rangle = \sum_j \alpha_j |a_j\rangle, \quad |b\rangle = \sum_p \beta_p |b_p\rangle \tag{6.4} $$

in $\mathcal{A}$ and $\mathcal{B}$, respectively, their tensor product is defined as

$$ |a\rangle \otimes |b\rangle = \sum_j \sum_p \alpha_j \beta_p \bigl( |a_j\rangle \otimes |b_p\rangle \bigr), \tag{6.5} $$

which is of the form (6.3) with

$$ \gamma_{jp} = \alpha_j \beta_p. \tag{6.6} $$

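The parentheses in (6.3) and (6.5) are not really essential, since $(\alpha|a\rangle) \otimes |b\rangle$ is
equal to $\alpha(|a\rangle \otimes |b\rangle)$, and we shall henceforth omit them when this gives rise to
no ambiguities. The definition (6.5) implies that the tensor product operation $\otimes$ is
distributive:

$$ \begin{aligned}
|a\rangle \otimes \bigl( \beta'|b'\rangle + \beta''|b''\rangle \bigr) &= \beta'|a\rangle \otimes |b'\rangle + \beta''|a\rangle \otimes |b''\rangle, \\
\bigl( \alpha'|a'\rangle + \alpha''|a''\rangle \bigr) \otimes |b\rangle &= \alpha'|a'\rangle \otimes |b\rangle + \alpha''|a''\rangle \otimes |b\rangle.
\end{aligned} \tag{6.7} $$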

An element of $\mathcal{A} \otimes \mathcal{B}$ which can be written in the form $|a\rangle \otimes |b\rangle$ is called a product
state, and states which are not product states are said to be entangled. When
several coefficients in (6.3) are nonzero, it may not be readily apparent whether the
corresponding state is a product state or entangled, that is, whether or not $\gamma_{jp}$ can
be written in the form (6.6). For example,

$$ 1.0\,|a_1\rangle \otimes |b_1\rangle + 0.5\,|a_1\rangle \otimes |b_2\rangle - 1.0\,|a_2\rangle \otimes |b_1\rangle - 0.5\,|a_2\rangle \otimes |b_2\rangle \tag{6.8} $$

is a product state $(|a_1\rangle - |a_2\rangle) \otimes (|b_1\rangle + 0.5\,|b_2\rangle)$, whereas changing the sign of the
last coefficient yields an entangled state:

$$ 1.0\,|a_1\rangle \otimes |b_1\rangle + 0.5\,|a_1\rangle \otimes |b_2\rangle - 1.0\,|a_2\rangle \otimes |b_1\rangle + 0.5\,|a_2\rangle \otimes |b_2\rangle. \tag{6.9} $$
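
One practical way to decide whether the coefficients factor as in (6.6) is to arrange them in an m × n matrix and compute its rank: the matrix is an outer product of two vectors exactly when its rank is 1. The sketch below applies this test to (6.8) and (6.9); the helper name `is_product_state` is hypothetical, and the rank test is a standard linear-algebra fact rather than a procedure given in the text.

```python
import numpy as np

def is_product_state(gamma, tol=1e-12):
    """Return True if gamma[j, p] can be written as alpha[j] * beta[p], as in (6.6)."""
    return np.linalg.matrix_rank(np.asarray(gamma), tol=tol) == 1

# Rows are labeled by j (basis of A), columns by p (basis of B).
gamma_68 = np.array([[1.0, 0.5], [-1.0, -0.5]])   # coefficients of (6.8)
gamma_69 = np.array([[1.0, 0.5], [-1.0, 0.5]])    # coefficients of (6.9)

# The factorization quoted in the text for (6.8): alpha = (1, -1), beta = (1, 0.5).
factored = np.outer([1.0, -1.0], [1.0, 0.5])

print(is_product_state(gamma_68), is_product_state(gamma_69))
```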

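The linear functional or bra vector corresponding to the product state $|a\rangle \otimes |b\rangle$
is written as

$$ \bigl( |a\rangle \otimes |b\rangle \bigr)^\dagger = \langle a| \otimes \langle b|, \tag{6.10} $$

where the $\mathcal{A} \otimes \mathcal{B}$ order of the factors on either side of $\otimes$ does not change when
the dagger operation is applied. The result for a general linear combination (6.3)
follows from (6.10) and the antilinearity of the dagger operation:

$$ \langle\psi| = \bigl( |\psi\rangle \bigr)^\dagger = \sum_{jp} \gamma^*_{jp} \, \langle a_j| \otimes \langle b_p|. \tag{6.11} $$

Consistent with these formulas, the inner product of two product states is given by

$$ \bigl( |a\rangle \otimes |b\rangle \bigr)^\dagger \bigl( |a'\rangle \otimes |b'\rangle \bigr) = \langle a|a'\rangle \cdot \langle b|b'\rangle, \tag{6.12} $$

and of a general state $|\psi\rangle$, (6.3), with another state

$$ |\psi'\rangle = \sum_{jp} \gamma'_{jp} \, |a_j\rangle \otimes |b_p\rangle \tag{6.13} $$

by the expression

$$ \langle\psi|\psi'\rangle = \sum_{jp} \gamma^*_{jp} \gamma'_{jp}. \tag{6.14} $$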

Because the definition of a tensor product given above employs specific orthonormal
bases for $\mathcal{A}$ and $\mathcal{B}$, one might suppose that the space $\mathcal{A} \otimes \mathcal{B}$ somehow
depends on the choice of these bases. But in fact it does not, as can be seen by
considering alternative bases $\{|a'_k\rangle\}$ and $\{|b'_q\rangle\}$. The kets in the new bases can be
written as linear combinations of the original kets,

$$ |a'_k\rangle = \sum_j \langle a_j | a'_k \rangle \cdot |a_j\rangle, \quad |b'_q\rangle = \sum_p \langle b_p | b'_q \rangle \cdot |b_p\rangle, \tag{6.15} $$

and (6.5) then allows $|a'_k\rangle \otimes |b'_q\rangle$ to be written as a linear combination of the kets
$|a_j\rangle \otimes |b_p\rangle$. Hence the use of different bases for $\mathcal{A}$ or $\mathcal{B}$ leads to the same tensor
product space $\mathcal{A} \otimes \mathcal{B}$, and it is easily checked that the property of being a product
state or an entangled state does not depend upon the choice of bases.

Just as for any other Hilbert space, it is possible to choose an orthonormal basis
of $\mathcal{A} \otimes \mathcal{B}$ in a large number of different ways. We shall refer to a basis of the type
used in the original definition, $\{|a_j\rangle \otimes |b_p\rangle\}$, as a product of bases. An orthonormal
basis of $\mathcal{A} \otimes \mathcal{B}$ may consist entirely of product states without being a product of
bases; see the example in (6.22). Or it might consist entirely of entangled states, or
of some entangled states and some product states.

Physicists often omit the $\otimes$ and write $|a\rangle \otimes |b\rangle$ in the form $|a\rangle|b\rangle$, or more
compactly as $|a, b\rangle$, or even as $|ab\rangle$. Any of these notations is perfectly adequate
when it is clear from the context that a tensor product is involved. We shall often
use one of the more compact notations, and occasionally insert the $\otimes$ symbol for
the sake of clarity, or for emphasis. Note that while a double label inside a ket, as
in $|a, b\rangle$, often indicates a tensor product, this is not always the case; for example,
the double label $|l, m\rangle$ for orbital angular momentum kets does not signify a tensor
product.

The tensor product of three or more Hilbert spaces can be obtained by an obvious
generalization of the ideas given above. In particular, the tensor product $\mathcal{A} \otimes \mathcal{B} \otimes \mathcal{C}$
of three Hilbert spaces $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$ consists of all linear combinations of states of the
form

$$ |a_j\rangle \otimes |b_p\rangle \otimes |c_s\rangle, \tag{6.16} $$

using the bases introduced earlier, together with $\{|c_s\rangle : s = 1, 2, \ldots\}$, an orthonormal
basis for $\mathcal{C}$. One can think of $\mathcal{A} \otimes \mathcal{B} \otimes \mathcal{C}$ as obtained by first forming the tensor
product of two of the spaces, and then taking the tensor product of this space with
the third. The final result does not depend upon which spaces form the initial pairing:

$$ \mathcal{A} \otimes \mathcal{B} \otimes \mathcal{C} = (\mathcal{A} \otimes \mathcal{B}) \otimes \mathcal{C} = \mathcal{A} \otimes (\mathcal{B} \otimes \mathcal{C}) = (\mathcal{A} \otimes \mathcal{C}) \otimes \mathcal{B}. \tag{6.17} $$

In what follows we shall usually focus on tensor products of two spaces, but for the
most part the discussion can be generalized in an obvious way to tensor products
of three or more spaces. Where this is not the case it will be pointed out explicitly.

Given any state $|\psi\rangle$ in $\mathcal{A} \otimes \mathcal{B}$, it is always possible to find particular orthonormal
bases $\{|\hat a_j\rangle\}$ for $\mathcal{A}$ and $\{|\hat b_p\rangle\}$ for $\mathcal{B}$ such that $|\psi\rangle$ takes the form

$$ |\psi\rangle = \sum_j \lambda_j \, |\hat a_j\rangle \otimes |\hat b_j\rangle. \tag{6.18} $$

Here the $\lambda_j$ are complex numbers, but by choosing appropriate phases for the basis
states, one can make them real and nonnegative. The summation index $j$ takes
values between 1 and the minimum of the dimensions of $\mathcal{A}$ and $\mathcal{B}$. The result
(6.18) is known as the Schmidt decomposition of $|\psi\rangle$; it is also referred to as the
biorthogonal or polar expansion of $|\psi\rangle$. It does not generalize, at least in any
simple way, to a tensor product of three or more Hilbert spaces.
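
In finite dimensions the Schmidt decomposition (6.18) can be obtained from the singular value decomposition of the coefficient matrix of (6.3). The sketch below assumes that identification (a standard result, not derived in the text); the function name `schmidt` is hypothetical, and the example coefficients are those of the unnormalized state (6.9).

```python
import numpy as np

def schmidt(gamma):
    """Schmidt data for |psi> = sum_jp gamma[j, p] |a_j> (x) |b_p>.

    Returns (lam, a_hat, b_hat): real nonnegative coefficients lam[k], and
    matrices whose k-th columns hold the components of the Schmidt kets in
    the original bases of A and B.
    """
    gamma = np.asarray(gamma, dtype=complex)
    U, lam, Vh = np.linalg.svd(gamma, full_matrices=False)
    # gamma = U @ diag(lam) @ Vh, so the B-side ket k has components Vh[k, :].
    return lam, U, Vh.T

gamma = np.array([[1.0, 0.5], [-1.0, 0.5]])   # coefficients of (6.9)
lam, a_hat, b_hat = schmidt(gamma)

# Rebuild gamma from the Schmidt form: sum_k lam[k] * a_hat[:, k] (outer) b_hat[:, k].
rebuilt = sum(lam[k] * np.outer(a_hat[:, k], b_hat[:, k]) for k in range(len(lam)))
schmidt_rank = int(np.sum(lam > 1e-12))   # 1 for a product state, more if entangled
```

The singular values come out real and nonnegative automatically, which corresponds to the phase choice mentioned after (6.18).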

Given an arbitrary Hilbert space $\mathcal{H}$ of dimension $mn$, with $m$ and $n$ integers
greater than 1, it is possible to “decompose” it into a tensor product $\mathcal{A} \otimes \mathcal{B}$, with
$m$ the dimension of $\mathcal{A}$ and $n$ the dimension of $\mathcal{B}$; indeed, this can be done in many
different ways. Let $\{|h_l\rangle\}$ be any orthonormal basis of $\mathcal{H}$, with $l = 1, 2, \ldots, mn$.
Rather than use a single label for the kets, we can associate each $l$ with a pair
$j, p$, where $j$ takes values between 1 and $m$, and $p$ values between 1 and $n$. Any
association will do, as long as it is unambiguous (one-to-one). Let $\{|h_{jp}\rangle\}$ denote
precisely the same basis using this new labeling. Now write

$$ |h_{jp}\rangle = |a_j\rangle \otimes |b_p\rangle, \tag{6.19} $$

where the $\{|a_j\rangle\}$ for $j$ between 1 and $m$ are defined to be an orthonormal basis of
a Hilbert space $\mathcal{A}$, and the $\{|b_p\rangle\}$ for $p$ between 1 and $n$ the orthonormal basis of
a Hilbert space $\mathcal{B}$. By this process we have turned $\mathcal{H}$ into a tensor product $\mathcal{A} \otimes \mathcal{B}$,
or it might be better to say that we have imposed a tensor product structure $\mathcal{A} \otimes \mathcal{B}$
upon the Hilbert space $\mathcal{H}$. In the same way, if the dimension of $\mathcal{H}$ is the product
of three or more integers greater than 1, it can always be thought of as a tensor
product of three or more spaces, and the decomposition can be carried out in many
different ways.
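
The relabeling in (6.19) amounts to reshaping a vector of length mn into an m × n array. The sketch below illustrates this for a hypothetical four-dimensional example with zero-based labels, and shows that two different one-to-one associations of l with (j, p) impose different tensor product structures: the same vector is entangled with respect to one and a product state with respect to the other. The vector and the permutation are arbitrary choices made for illustration.

```python
import numpy as np

m, n = 2, 2
# Components of a state in the basis |h_l>, l = 0, 1, 2, 3 (zero-based labels).
h = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)

# Association 1: l = j*n + p (row-major order, the convention np.kron also uses).
gamma_1 = h.reshape(m, n)

# Association 2, equally valid: the pair (j, p) is assigned l = perm[j*n + p].
perm = np.array([0, 3, 1, 2])
gamma_2 = h[perm].reshape(m, n)

# Rank 1 means product state, higher rank means entangled (see (6.6)).
rank_1 = np.linalg.matrix_rank(gamma_1)
rank_2 = np.linalg.matrix_rank(gamma_2)
print(rank_1, rank_2)
```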


6.3 Examples of composite quantum systems 

Figure 6.1(a) shows a toy model involving two particles. The first particle can be
at any one of the $M = 6$ sites indicated by circles, and the second particle can be
at one of the two sites indicated by squares. The states $|m\rangle$ for $m$ between 0 and
5 span the Hilbert space $\mathcal{M}$ for the first particle, and $|n\rangle$ for $n = 0, 1$ the Hilbert
space $\mathcal{N}$ for the second particle. The tensor product space $\mathcal{M} \otimes \mathcal{N}$ is $6 \times 2 = 12$
dimensional, with basis states $|m\rangle \otimes |n\rangle = |m, n\rangle$. (In Sec. 7.4 we shall put this
arrangement to good use: the second particle will be employed as a detector to
detect the passage of the first particle.) One must carefully distinguish the case of
two particles, one located on the circles and one on the squares in Fig. 6.1(a), from
that of a single particle which can be located on either the circles or the squares.
The former has a Hilbert space of dimension 12, and the latter a Hilbert space of
dimension 6 + 2 = 8.


[Figure 6.1: panel (a) shows six circles labeled m = 0, 1, ..., 5 and two squares
labeled n = 0 and n = 1; panel (b) shows two rows of six circles, one row for
σ = +1 and one for σ = −1, at sites m = 0, 1, ..., 5.]

Fig. 6.1. Toy model for: (a) two particles, one located on the circles and one on the squares;
(b) a particle with an internal degree of freedom.


A second toy model, Fig. 6.1(b), consists of a single particle with an internal
degree of freedom represented by a “spin” variable which can take on two possible
values. The center of mass of the particle can be at any one of six sites corresponding
to a six-dimensional Hilbert space $\mathcal{M}$, whereas the spin degree of freedom is
represented by a two-dimensional Hilbert space $\mathcal{S}$. The basis kets of $\mathcal{M} \otimes \mathcal{S}$ have
the form $|m, \sigma\rangle$, with $\sigma = \pm 1$. The figure shows two circles at each site, one
corresponding to $\sigma = +1$ (“spin up”), and the other to $\sigma = -1$ (“spin down”), so one
can think of each basis state as the particle being “at” one of the circles. A general
element $|\psi\rangle$ of the Hilbert space $\mathcal{M} \otimes \mathcal{S}$ is a linear combination of the basis kets,
so it can be written in the form

$$ |\psi\rangle = \sum_m \sum_\sigma \psi(m, \sigma) \, |m, \sigma\rangle, \tag{6.20} $$

where the complex coefficients $\psi(m, \sigma)$ form a toy wave function; this is simply
an alternative way of writing the complex coefficients $\gamma_{jp}$ in (6.3). The toy
wave function $\psi(m, \sigma)$ can be thought of as a discrete analog of the wave function
$\psi(\mathbf{r}, \sigma)$ used to describe a spin-half particle in three dimensions. Just as in the
toy model, the Hilbert space to which $\psi(\mathbf{r}, \sigma)$ belongs is a tensor product of the
space of wave functions $\psi(\mathbf{r})$, appropriate for a spinless quantum particle, with the
two-dimensional spin space.

Consider two spin-half particles $a$ and $b$, such as an electron and a proton, and
ignore their center of mass degrees of freedom. The tensor product $\mathcal{H}$ of the two
2-dimensional spin spaces is a four-dimensional space spanned by the orthonormal
basis

$$ |z_a^+\rangle \otimes |z_b^+\rangle, \quad |z_a^+\rangle \otimes |z_b^-\rangle, \quad |z_a^-\rangle \otimes |z_b^+\rangle, \quad |z_a^-\rangle \otimes |z_b^-\rangle \tag{6.21} $$

in the notation of Sec. 4.2. This is a product of bases in the terminology of Sec. 6.2.
By contrast, the basis

$$ |z_a^+\rangle \otimes |z_b^+\rangle, \quad |z_a^+\rangle \otimes |z_b^-\rangle, \quad |z_a^-\rangle \otimes |x_b^+\rangle, \quad |z_a^-\rangle \otimes |x_b^-\rangle, \tag{6.22} $$

even though it consists of product states, is not a product of bases, because one
basis for $\mathcal{B}$ is employed along with $|z_a^+\rangle$, and a different basis along with $|z_a^-\rangle$. Still
other bases are possible, including cases in which some or all of the basis vectors
are entangled states.

The spin space for three spin-half particles $a$, $b$, and $c$ is an eight-dimensional
tensor product space, and the state $|z_a^+\rangle \otimes |z_b^+\rangle \otimes |z_c^+\rangle$ along with the seven other
states in which some of the pluses are replaced by minuses forms a product basis.
For $N$ spins, the tensor product space is of dimension $2^N$.
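
The orthonormality of the basis (6.22) is easy to confirm numerically. This sketch assumes the common convention that the S_x eigenkets are (|z+> ± |z->)/√2; Sec. 4.2 is not reproduced here, so that choice of phases is an assumption. It also builds one three-spin product state to show the dimension doubling with each added spin.

```python
from functools import reduce

import numpy as np

z_plus = np.array([1.0, 0.0])
z_minus = np.array([0.0, 1.0])
# Assumed convention for the S_x eigenkets in the S_z basis.
x_plus = (z_plus + z_minus) / np.sqrt(2)
x_minus = (z_plus - z_minus) / np.sqrt(2)

# Columns are the four kets of (6.22); np.kron plays the role of the tensor product.
basis_622 = np.column_stack([
    np.kron(z_plus, z_plus),
    np.kron(z_plus, z_minus),
    np.kron(z_minus, x_plus),
    np.kron(z_minus, x_minus),
])
gram = basis_622.conj().T @ basis_622   # matrix of inner products, identity if orthonormal

# |z_a+> (x) |z_b+> (x) |z_c+> lives in a space of dimension 2**3 = 8.
three_spins = reduce(np.kron, [z_plus, z_plus, z_plus])
print(three_spins.shape)
```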


6.4 Product operators 

Since $\mathcal{A} \otimes \mathcal{B}$ is a Hilbert space, operators on it obey all the usual rules, Sec. 3.3.
What we are interested in is how these operators are related to the tensor product
structure, and, in particular, to operators on the separate factor spaces $\mathcal{A}$ and $\mathcal{B}$. In
this section we discuss the special case of product operators, while general operators
are considered in the next section. The considerations which follow can be
generalized in an obvious way to a tensor product of three or more spaces.

If $A$ is an operator on $\mathcal{A}$ and $B$ an operator on $\mathcal{B}$, the (tensor) product operator
$A \otimes B$ acting on a product state $|a\rangle \otimes |b\rangle$ yields another product state:

$$ (A \otimes B)\bigl( |a\rangle \otimes |b\rangle \bigr) = \bigl( A|a\rangle \bigr) \otimes \bigl( B|b\rangle \bigr). \tag{6.23} $$

Since $A \otimes B$ is by definition a linear operator on $\mathcal{A} \otimes \mathcal{B}$, one can use (6.23) to
define its action on a general element $|\psi\rangle$, (6.3), of $\mathcal{A} \otimes \mathcal{B}$:

$$ (A \otimes B)\Bigl[ \sum_{jp} \gamma_{jp} \bigl( |a_j\rangle \otimes |b_p\rangle \bigr) \Bigr] = \sum_{jp} \gamma_{jp} \bigl( A|a_j\rangle \otimes B|b_p\rangle \bigr). \tag{6.24} $$

The tensor product of two operators which are themselves sums of other operators
can be written as a sum of product operators using the usual distributive rules.
Thus:

$$ \begin{aligned}
(\alpha A + \alpha' A') \otimes B &= \alpha (A \otimes B) + \alpha' (A' \otimes B), \\
A \otimes (\beta B + \beta' B') &= \beta (A \otimes B) + \beta' (A \otimes B').
\end{aligned} \tag{6.25} $$

The parentheses on the right side are not essential, as there is no ambiguity when
$\alpha(A \otimes B) = (\alpha A) \otimes B$ is written as $\alpha A \otimes B$.

If $|\psi\rangle = |a\rangle \otimes |b\rangle$ and $|\phi\rangle = |a'\rangle \otimes |b'\rangle$ are both product states, the dyad $|\psi\rangle\langle\phi|$
is a product operator:

$$ \bigl( |a\rangle \otimes |b\rangle \bigr)\bigl( \langle a'| \otimes \langle b'| \bigr) = \bigl( |a\rangle\langle a'| \bigr) \otimes \bigl( |b\rangle\langle b'| \bigr). \tag{6.26} $$

Notice how the terms on the left are rearranged in order to arrive at the expression
on the right. One can omit the parentheses on the right side, since $|a\rangle\langle a'| \otimes |b\rangle\langle b'|$
is unambiguous.

The adjoint of a product operator is the tensor product of the adjoints in the same
order relative to the symbol $\otimes$:

$$ (A \otimes B)^\dagger = A^\dagger \otimes B^\dagger. \tag{6.27} $$

Of course, if the operators on $\mathcal{A}$ and $\mathcal{B}$ are themselves products, one must reverse
their order when taking the adjoint:

$$ (A_1 A_2 \otimes B_1 B_2 B_3)^\dagger = A_2^\dagger A_1^\dagger \otimes B_3^\dagger B_2^\dagger B_1^\dagger. \tag{6.28} $$

The ordinary operator product of two tensor product operators is given by

$$ (A \otimes B) \cdot (A' \otimes B') = AA' \otimes BB', \tag{6.29} $$

where it is important that the order of the operators be preserved: $A$ is to the left
of $A'$ on both sides of the equation, and likewise $B$ is to the left of $B'$. An operator
product of sums of tensor products can be worked out using the usual distributive
law, e.g.,

$$ (A \otimes B) \cdot (A' \otimes B' + A'' \otimes B'') = AA' \otimes BB' + AA'' \otimes BB''. \tag{6.30} $$
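
The rules (6.23), (6.27), and (6.29) can be spot-checked with matrices, using the Kronecker product `np.kron` as the matrix representation of the tensor product in a product of bases. The sketch below uses randomly generated operators on a 2-dimensional and a 3-dimensional space; the dimensions and the random seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_op(d):
    """Random complex d x d matrix standing in for a generic operator."""
    return rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

def rand_ket(d):
    return rng.normal(size=d) + 1j * rng.normal(size=d)

A, A2 = rand_op(2), rand_op(2)     # operators on the 2-dimensional factor
B, B2 = rand_op(3), rand_op(3)     # operators on the 3-dimensional factor
a, b = rand_ket(2), rand_ket(3)

# (6.23): a product operator maps a product state to a product state.
lhs_623 = np.kron(A, B) @ np.kron(a, b)
rhs_623 = np.kron(A @ a, B @ b)

# (6.27): the adjoint keeps the order of the factors around the tensor symbol.
lhs_627 = np.kron(A, B).conj().T
rhs_627 = np.kron(A.conj().T, B.conj().T)

# (6.29): operator products are taken factor by factor, preserving order.
lhs_629 = np.kron(A, B) @ np.kron(A2, B2)
rhs_629 = np.kron(A @ A2, B @ B2)
```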

An operator $A$ on $\mathcal{A}$ can be extended to an operator $A \otimes I_{\mathcal{B}}$ on $\mathcal{A} \otimes \mathcal{B}$, where
$I_{\mathcal{B}}$ is the identity on $\mathcal{B}$. It is customary to use the same symbol, $A$, for both the
original operator and its extension; indeed, in practice it would often be quite awkward
to do anything else. Similarly, $B$ is used to denote either an operator on
$\mathcal{B}$, or its extension $I_{\mathcal{A}} \otimes B$. Consider, for example, two spin-half particles, an
electron and a proton. It is convenient to use the symbol $S_{ez}$ for the operator
corresponding to the z-component of the spin of the electron, whether one is thinking
of the two-dimensional Hilbert space associated with the electron spin by itself,
the four-dimensional spin space for both particles, the infinite-dimensional space
of electron space-and-spin wave functions, or the space needed to describe the spin
and position of both the electron and the proton.

Using the same symbol for an operator and its extension normally causes no
confusion, since the space to which the operator is applied will be evident from the
context. However, it is sometimes useful to employ the longer notation for clarity
or emphasis, in which case one can (usually) omit the subscript from the identity
operator: in the operator $A \otimes I$ it is clear that $I$ is the identity on $\mathcal{B}$. Note that
(6.29) allows one to write

$$ A \otimes B = (A \otimes I) \cdot (I \otimes B) = (I \otimes B) \cdot (A \otimes I), \tag{6.31} $$

and hence if we use $A$ for $A \otimes I$ and $B$ for $I \otimes B$, $A \otimes B$ can be written as the operator
product $AB$ or $BA$. This is perfectly correct and unambiguous as long as it is clear
from the context that $A$ is an operator on $\mathcal{A}$ and $B$ an operator on $\mathcal{B}$. However, if
$\mathcal{A}$ and $\mathcal{B}$ are identical (isomorphic) spaces, and $B$ denotes an operator which also
makes sense on $\mathcal{A}$, then $AB$ could be interpreted as the ordinary product of two
operators on $\mathcal{A}$ (or on $\mathcal{B}$), and to avoid confusion it is best to use the unabbreviated
$A \otimes B$.
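
The distinction drawn here can be made concrete for two spin-half particles. In the sketch below the spin matrices are written in units of ħ, which is an assumption about conventions, and the variable names `S_ez` and `S_px` for the extended electron and proton operators are illustrative labels.

```python
import numpy as np

# Spin-half matrices in the S_z basis, in units of hbar (assumed convention).
S_z = 0.5 * np.array([[1.0, 0.0], [0.0, -1.0]])
S_x = 0.5 * np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.eye(2)

S_ez = np.kron(S_z, I2)    # electron S_z extended to the four-dimensional spin space
S_px = np.kron(I2, S_x)    # proton S_x extended to the same space
product = np.kron(S_z, S_x)

# (6.31): the two extensions commute, and either order gives the product operator.
ok_left = np.allclose(S_ez @ S_px, product)
ok_right = np.allclose(S_px @ S_ez, product)

# By contrast, the ordinary product of the 2 x 2 matrices acts on a single
# spin space and does depend on the order of the factors.
single_space_commute = np.allclose(S_z @ S_x, S_x @ S_z)
```

Because both factor spaces are two-dimensional here, the abbreviation "S_z S_x" would be ambiguous in exactly the way the text warns about.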


6.5 General operators, matrix elements, partial traces 

Any operator on a Hilbert space is uniquely specified by its matrix elements in
some orthonormal basis, Sec. 3.6, and thus a general operator $D$ on $\mathcal{A} \otimes \mathcal{B}$ is
determined by its matrix elements in the orthonormal basis (6.2). These can be
written in a variety of different ways:

$$ \begin{aligned}
\langle jp|D|kq\rangle &= \langle j, p|D|k, q\rangle = \langle a_j b_p|D|a_k b_q\rangle \\
&= \bigl( \langle a_j| \otimes \langle b_p| \bigr) D \bigl( |a_k\rangle \otimes |b_q\rangle \bigr).
\end{aligned} \tag{6.32} $$

The most compact notation is on the left, but it is not always the clearest. Note that
it corresponds to writing bras and kets with a “double label”, and this needs to be
taken into account in standard formulas, such as

$$ I = I \otimes I = \sum_j \sum_p |jp\rangle\langle jp|, \tag{6.33} $$

$$ \mathrm{Tr}(D) = \sum_j \sum_p \langle jp|D|jp\rangle, \tag{6.34} $$

which correspond to (3.54) and (3.79), respectively.

Any operator can be written as a sum of dyads multiplied by appropriate matrix
elements, (3.67), which allows us to write

$$ D = \sum_{jk} \sum_{pq} \langle a_j b_p|D|a_k b_q\rangle \bigl( |a_j\rangle\langle a_k| \otimes |b_p\rangle\langle b_q| \bigr), \tag{6.35} $$

where we have used (6.26) to rewrite the dyads as product operators. This shows,
incidentally, that while not all operators on $\mathcal{A} \otimes \mathcal{B}$ are product operators, any
operator can be written as a sum of product operators. The adjoint of $D$ is then given
by the formula

$$ D^\dagger = \sum_{jk} \sum_{pq} \langle a_j b_p|D|a_k b_q\rangle^* \bigl( |a_k\rangle\langle a_j| \otimes |b_q\rangle\langle b_p| \bigr), \tag{6.36} $$

using (6.27) and the fact that the dagger operation is antilinear. If one replaces
$\langle a_j b_p|D|a_k b_q\rangle^*$ by $\langle a_k b_q|D^\dagger|a_j b_p\rangle$, see (3.64), (6.36) is simply (6.35) with
$D$ replaced by $D^\dagger$ on both sides, aside from dummy summation indices.

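The matrix elements of a product operator using the basis (6.2) are the products
of the matrix elements of the factors:

$$ \langle a_j b_p|A \otimes B|a_k b_q\rangle = \langle a_j|A|a_k\rangle \cdot \langle b_p|B|b_q\rangle. \tag{6.37} $$

From this it follows that the trace of a product operator is the product of the traces
of its factors:

$$ \mathrm{Tr}[A \otimes B] = \sum_j \langle a_j|A|a_j\rangle \cdot \sum_p \langle b_p|B|b_p\rangle = \mathrm{Tr}_{\mathcal{A}}[A] \cdot \mathrm{Tr}_{\mathcal{B}}[B]. \tag{6.38} $$

Here the subscripts on $\mathrm{Tr}_{\mathcal{A}}$ and $\mathrm{Tr}_{\mathcal{B}}$ indicate traces over the spaces $\mathcal{A}$ and $\mathcal{B}$,
respectively, while the trace over $\mathcal{A} \otimes \mathcal{B}$ is written without a subscript, though one
could denote it by $\mathrm{Tr}_{\mathcal{AB}}$ or $\mathrm{Tr}_{\mathcal{A} \otimes \mathcal{B}}$. Thus if $\mathcal{A}$ and $\mathcal{B}$ are spaces of dimension $m$ and
$n$, $\mathrm{Tr}_{\mathcal{A}}[I] = m$, $\mathrm{Tr}_{\mathcal{B}}[I] = n$, and $\mathrm{Tr}[I] = mn$.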

Given an operator $D$ on $\mathcal{A} \otimes \mathcal{B}$, and two basis states $|b_p\rangle$ and $|b_q\rangle$ of $\mathcal{B}$, one can
define $\langle b_p|D|b_q\rangle$ to be the (unique) operator on $\mathcal{A}$ which has matrix elements

$$ \langle a_j| \bigl( \langle b_p|D|b_q\rangle \bigr) |a_k\rangle = \langle a_j b_p|D|a_k b_q\rangle. \tag{6.39} $$

The partial trace over $\mathcal{B}$ of the operator $D$ is defined to be a sum of operators of
this type:

$$ D_{\mathcal{A}} = \mathrm{Tr}_{\mathcal{B}}[D] = \sum_p \langle b_p|D|b_p\rangle. \tag{6.40} $$

Alternatively, one can define $D_{\mathcal{A}}$ to be the operator on $\mathcal{A}$ with matrix elements

$$ \langle a_j|D_{\mathcal{A}}|a_k\rangle = \sum_p \langle a_j b_p|D|a_k b_p\rangle. \tag{6.41} $$

Note that the $\mathcal{B}$ state labels are the same on both sides of the matrix elements on
the right sides of (6.40) and (6.41), while those for the $\mathcal{A}$ states are (in general)
different. Even though we have employed a specific orthonormal basis of $\mathcal{B}$ in
(6.40) and (6.41), it is not hard to show that the partial trace $D_{\mathcal{A}}$ is independent of
this basis; that is, one obtains precisely the same operator if a different orthonormal
basis $\{|b'_p\rangle\}$ is used in place of $\{|b_p\rangle\}$.

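If $D$ is written in the form (6.35), its partial trace is

$$ D_{\mathcal{A}} = \mathrm{Tr}_{\mathcal{B}}[D] = \sum_{jk} d_{jk} \, |a_j\rangle\langle a_k|, \tag{6.42} $$

where

$$ d_{jk} = \sum_p \langle a_j b_p|D|a_k b_p\rangle, \tag{6.43} $$

since the trace over $\mathcal{B}$ of $|b_p\rangle\langle b_q|$ is $\langle b_p|b_q\rangle = \delta_{pq}$. In the special case of a product
operator $A \otimes B$, the partial trace over $\mathcal{B}$ yields an operator

$$ \mathrm{Tr}_{\mathcal{B}}[A \otimes B] = \bigl( \mathrm{Tr}_{\mathcal{B}}[B] \bigr) A \tag{6.44} $$

proportional to $A$.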

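In a similar way, the partial trace of an operator $D$ on $\mathcal{A} \otimes \mathcal{B}$ over $\mathcal{A}$ yields an
operator

$$ D_{\mathcal{B}} = \mathrm{Tr}_{\mathcal{A}}[D] \tag{6.45} $$

acting on the space $\mathcal{B}$, with matrix elements

$$ \langle b_p|D_{\mathcal{B}}|b_q\rangle = \sum_j \langle a_j b_p|D|a_j b_q\rangle. \tag{6.46} $$

Note that the full trace of $D$ over $\mathcal{A} \otimes \mathcal{B}$ can be written as a trace of either of its
partial traces:

$$ \mathrm{Tr}[D] = \mathrm{Tr}_{\mathcal{A}}[D_{\mathcal{A}}] = \mathrm{Tr}_{\mathcal{B}}[D_{\mathcal{B}}]. \tag{6.47} $$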
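
The definitions (6.41) and (6.46) translate directly into index contractions. The sketch below assumes that operators on the product space are stored as mn × mn matrices in the product of bases with the double label (j, p) flattened as j·n + p, which is the ordering `np.kron` produces; the function names are hypothetical.

```python
import numpy as np

def partial_trace_B(D, m, n):
    """D_A = Tr_B[D], following (6.41): sum over the shared B label p."""
    return np.einsum('jpkp->jk', D.reshape(m, n, m, n))

def partial_trace_A(D, m, n):
    """D_B = Tr_A[D], following (6.46): sum over the shared A label j."""
    return np.einsum('jpjq->pq', D.reshape(m, n, m, n))

rng = np.random.default_rng(1)
m, n = 2, 3
A = rng.normal(size=(m, m))
B = rng.normal(size=(n, n))
D = np.kron(A, B) + rng.normal(size=(m * n, m * n))   # a generic, non-product operator

D_A = partial_trace_B(D, m, n)
D_B = partial_trace_A(D, m, n)
```

With these definitions one can check (6.38), (6.44), and (6.47) on the arrays `A`, `B`, and `D`.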

All of the above can be generalized to a tensor product of three or more spaces
in an obvious way. For example, if $E$ is an operator on $\mathcal{A} \otimes \mathcal{B} \otimes \mathcal{C}$, its matrix
elements using the orthonormal product of bases in (6.16) are of the form

$$ \langle jpr|E|kqs\rangle = \langle a_j b_p c_r|E|a_k b_q c_s\rangle. \tag{6.48} $$

The partial trace of $E$ over $\mathcal{C}$ is an operator on $\mathcal{A} \otimes \mathcal{B}$, while its partial trace over
$\mathcal{B} \otimes \mathcal{C}$ is an operator on $\mathcal{A}$, etc.


6.6 Product properties and product of sample spaces

Let $A$ and $B$ be projectors representing properties of two physical systems $\mathcal{A}$ and
$\mathcal{B}$, respectively. It is easy to show that

$$ P = A \otimes B = (A \otimes I) \cdot (I \otimes B) \tag{6.49} $$

is a projector, which therefore represents some property on the tensor product space
$\mathcal{A} \otimes \mathcal{B}$ of the combined system. (Note that if $A$ projects onto a pure state $|a\rangle$
and $B$ onto a pure state $|b\rangle$, then $P$ projects onto the pure state $|a\rangle \otimes |b\rangle$.) The
physical significance of $P$ is that $\mathcal{A}$ has the property $A$ and $\mathcal{B}$ has the property $B$.
In particular, the projector $A \otimes I$ has the significance that $\mathcal{A}$ has the property $A$
without reference to the system $\mathcal{B}$, since the identity operator $I = I_{\mathcal{B}}$ in $A \otimes I$ is the
property which is always true for $\mathcal{B}$, and thus tells us nothing whatsoever about $\mathcal{B}$.
Similarly, $I \otimes B$ means that $\mathcal{B}$ has the property $B$ without reference to the system $\mathcal{A}$.
The product of $A \otimes I$ with $I \otimes B$ (note that the two operators commute with each
other) represents the conjunction of the properties of the separate subsystems, in
agreement with the discussion in Sec. 4.5, and consistent with the interpretation of
$P$ given previously. As an example, consider two spin-half particles $a$ and $b$. The
projector $[z_a^+] \otimes [x_b^-]$ means that $S_{az} = +1/2$ for particle $a$ and $S_{bx} = -1/2$ for
particle $b$.

The interpretation of projectors on $\mathcal{A} \otimes \mathcal{B}$ which are not products of projectors
is more subtle. Consider, for example, the entangled state

$$ |\psi\rangle = \bigl( |z_a^+\rangle |z_b^-\rangle - |z_a^-\rangle |z_b^+\rangle \bigr) / \sqrt{2} \tag{6.50} $$

of two spin-half particles, and let $[\psi]$ be the corresponding dyad projector. Since
$[\psi]$ projects onto a subspace of $\mathcal{A} \otimes \mathcal{B}$, it represents some property of the combined
system. However, if we ask what this property means in terms of the $a$ spin by
itself, we run into the difficulty that the only projectors on the two-dimensional
spin space $\mathcal{A}$ which commute with $[\psi]$ are 0 and the identity $I$. Consequently, any
“interesting” property of $\mathcal{A}$, something of the form $S_{aw} = +1/2$ for some direction
$w$, is incompatible with $[\psi]$. Thus $[\psi]$ cannot be interpreted as meaning that the
$a$ spin has some property, and likewise it cannot mean that the $b$ spin has some
property.
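
The incompatibility can be illustrated numerically by computing the commutator of [ψ] with the extension of a single-spin projector for a few directions w. The sketch below assumes the singlet form of (6.50) and the usual parametrization of the S_w = +1/2 ket by polar angles θ and φ; the sample directions are arbitrary, so this is a spot check and not a proof for every w.

```python
import numpy as np

z_plus = np.array([1.0, 0.0], dtype=complex)
z_minus = np.array([0.0, 1.0], dtype=complex)

# Entangled state (6.50), assumed here to be the singlet, and its dyad projector.
psi = (np.kron(z_plus, z_minus) - np.kron(z_minus, z_plus)) / np.sqrt(2)
proj_psi = np.outer(psi, psi.conj())

def spin_up_projector(theta, phi):
    """Projector onto S_w = +1/2 for the direction w with polar angles (theta, phi)."""
    ket = np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])
    return np.outer(ket, ket.conj())

def commutator_norm(theta, phi):
    """Frobenius norm of the commutator of [psi] with [w+] (x) I."""
    P = np.kron(spin_up_projector(theta, phi), np.eye(2))
    return np.linalg.norm(proj_psi @ P - P @ proj_psi)

directions = [(0.0, 0.0), (np.pi / 2, 0.0), (np.pi / 2, np.pi / 2), (1.0, 2.0)]
norms = [commutator_norm(theta, phi) for theta, phi in directions]
print(norms)
```

A nonzero norm means the two projectors do not commute, which is the sense of "incompatible" used in the text.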

The same conclusion applies to any entangled state of two spin-half particles.
The situation is not quite as bad if one goes to higher-dimensional spaces. For
example, the projector $[\phi]$ corresponding to the entangled state

$$ |\phi\rangle = \bigl( |1\rangle \otimes |0\rangle + |2\rangle \otimes |1\rangle \bigr) / \sqrt{2} \tag{6.51} $$

of the toy model with two particles shown in Fig. 6.1(a) commutes with the
projector

$$ \bigl( [1] + [2] \bigr) \otimes I \tag{6.52} $$

for the first particle, and thus if the combined system is described by $[\phi]$, one can
say that the first particle is not outside the interval containing the sites $m = 1$ and
$m = 2$, although it cannot be assigned a location at one or the other of these sites.
However, one can say nothing interesting about the second particle.
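
These statements about the toy model can be checked with 12 × 12 matrices. In the sketch below, sites are zero-based as in Fig. 6.1(a), and the helper names `ket`, `site_projector`, and `commutes` are hypothetical.

```python
import numpy as np

def ket(index, dim):
    """Basis ket with a 1 in position `index` of a dim-dimensional space."""
    v = np.zeros(dim)
    v[index] = 1.0
    return v

def site_projector(sites):
    """Projector on the 6-site space onto the listed sites, e.g. [1] + [2]."""
    P = np.zeros((6, 6))
    for m in sites:
        P[m, m] = 1.0
    return P

def commutes(X, Y):
    return np.allclose(X @ Y, Y @ X)

# Entangled state (6.51) and its dyad projector [phi].
phi = (np.kron(ket(1, 6), ket(0, 2)) + np.kron(ket(2, 6), ket(1, 2))) / np.sqrt(2)
proj_phi = np.outer(phi, phi)

interval = np.kron(site_projector([1, 2]), np.eye(2))    # (6.52)
single_site = np.kron(site_projector([1]), np.eye(2))    # [1] (x) I
second_at_0 = np.kron(np.eye(6), np.diag([1.0, 0.0]))    # I (x) [0]

print(commutes(proj_phi, interval),
      commutes(proj_phi, single_site),
      commutes(proj_phi, second_at_0))
```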

A product of sample spaces or product of decompositions is a collection of projectors
$\{A_j \otimes B_p\}$ which sum to the identity

$$ I = \sum_{jp} A_j \otimes B_p \tag{6.53} $$

of $\mathcal{A} \otimes \mathcal{B}$, where $\{A_j\}$ is a decomposition of the identity for $\mathcal{A}$, and $\{B_p\}$ a
decomposition of the identity for $\mathcal{B}$. Note that the event algebra corresponding to (6.53)
contains all projectors of the form $\{A_j \otimes I\}$ or $\{I \otimes B_p\}$, so these properties of
the individual systems make sense in a description of the composite system based
upon this decomposition. A particular example of a product of sample spaces is
the collection of dyads corresponding to the product of bases in (6.2):

$$ I = \sum_{jp} |a_j\rangle\langle a_j| \otimes |b_p\rangle\langle b_p|. \tag{6.54} $$

A decomposition of the identity can consist of products of projectors without
being a product of sample spaces. An example is provided by the four projectors

$$ [z_a^+] \otimes [z_b^+], \quad [z_a^+] \otimes [z_b^-], \quad [z_a^-] \otimes [x_b^+], \quad [z_a^-] \otimes [x_b^-] \tag{6.55} $$

corresponding to the states in the basis (6.22) for two spin-half particles. (As noted
earlier, (6.22) is not a product of bases.) The event algebra generated by (6.55)
contains the projectors $[z_a^+] \otimes I$ and $[z_a^-] \otimes I$, but it does not contain the projectors
$I \otimes [z_b^+]$, $I \otimes [z_b^-]$, $I \otimes [x_b^+]$ or $I \otimes [x_b^-]$. Consequently one has the odd situation that
if the state $[z_a^-] \otimes [x_b^+]$, which would normally be interpreted to mean $S_{az} = -1/2$
AND $S_{bx} = +1/2$, is a correct description of the system, then using the event
algebra based upon (6.55), one can infer that $S_{az} = -1/2$ for spin $a$, but one
cannot infer that $S_{bx} = +1/2$ is a property of spin $b$ by itself, independent of
any reference to spin $a$. Further discussion of this peculiar state of affairs, which
arises when one is dealing with dependent or contextual properties, will be found
in Ch. 14.

7.1 The Schrödinger equation

The equations of motion of classical Hamiltonian dynamics are of the form 

$$ \frac{dx_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial x_i}, \tag{7.1} $$

where $x_1, x_2$, etc. are the (generalized) coordinates, $p_1, p_2$, etc. their conjugate
momenta, and $H(x_1, p_1, x_2, p_2, \ldots)$ is the Hamiltonian function on the classical
phase space. 

In the case of a particle moving in one dimension, there is only a single
coordinate x and a single momentum p, and the Hamiltonian is the total energy

$$ H = p^2/2m + V(x), \tag{7.2} $$

with V(x) the potential energy. The two equations of motion are then:

$$ dx/dt = p/m, \qquad dp/dt = -dV/dx. \tag{7.3} $$

For a harmonic oscillator V(x) is $\tfrac{1}{2}Kx^2$, and the general solution of (7.3) is given
in (2.1), where $\omega = \sqrt{K/m}$.

The set of equations (7.1) is deterministic in that there is a unique trajectory or 
orbit $\gamma(t)$ in the phase space as a function of time which passes through $\gamma_0$ at t = 0.
Of course, the orbit is also determined by giving the point in phase space through 
which it passes at some time other than t = 0. The orbit for a harmonic oscillator 
is an ellipse in the phase plane; see Fig. 2.1 on page 12. 

The quantum analog of (7.1) is Schrödinger’s equation, which in Dirac notation
can be written as

$$ i\hbar \frac{d}{dt}|\psi_t\rangle = H|\psi_t\rangle, \tag{7.4} $$

where H is the quantum Hamiltonian for the system, a Hermitian operator which
may itself depend upon the time. This is a linear equation, so that if $|\phi_t\rangle$ and $|\omega_t\rangle$
are any two solutions, the linear combination

$$ |\chi_t\rangle = \alpha|\phi_t\rangle + \beta|\omega_t\rangle \tag{7.5} $$

is also a solution, where $\alpha$ and $\beta$ are arbitrary (time-independent) constants.
Equation (7.4) is deterministic in the same sense as (7.1): a given $|\psi_0\rangle$ at t = 0 gives
rise to a unique solution $|\psi_t\rangle$ for all values of t. The result is a unitary dynamics
for the quantum system in a sense made precise in Sec. 7.3.

The Hamiltonian H in (7.4) must be an operator defined on the Hilbert space
$\mathcal{H}$ of the system one is interested in. This will be true for an isolated system, one
which does not interact with anything else: imagine something inside a completely
impermeable box. It will also be true if the interaction of the system with
the outside world can be approximated by an operator acting only on $\mathcal{H}$. For
example, the system may be located in an external magnetic field which is effectively
“classical”, that is, does not have to be assigned its own quantum mechanical
degrees of freedom, and thus enters the Hamiltonian H simply as a parameter.

One is sometimes interested in the dynamics of an open (in contrast to isolated) 
subsystem A of a composite system $A \otimes B$ when there is a significant interaction
between A and B. Of course (7.4) can be applied to the total composite system, 
assuming that it is isolated. However, there is no comparable equation for the sub- 
system A, as it cannot, at least in general, be described by its own wave function, 
and its dynamical evolution is influenced by that of the other subsystem B, often 
referred to as the environment of A. Constructing dynamical equations for open 
subsystems is a topic which lies outside the scope of this book, although Ch. 15 
on density matrices provides some preliminary hints on how to think about open 
subsystems. 

For a particle in one dimension moving in a potential V(x), (7.4) is equivalent
to the partial differential equation

$$ i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2 \psi}{\partial x^2} + V(x)\psi \tag{7.6} $$

for a wave packet $\psi(x, t)$ which depends upon the time as well as the position
variable x. The Hamiltonian in this case is the linear differential operator

$$ H = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x). \tag{7.7} $$


In general it is much more difficult to find solutions to (7.6) than it is to integrate 
(7.3). A formal solution to (7.6) for a harmonic oscillator is given in (7.23). 

One way to think about (7.4) is to choose an orthonormal basis $\{|j\rangle, j = 1, 2, \ldots\}$
of the Hilbert space $\mathcal{H}$ which is independent of the time t. Then (7.4)
is equivalent to a set of ordinary differential equations, one for each j:

$$ i\hbar \frac{d}{dt}\langle j|\psi_t\rangle = \langle j|H|\psi_t\rangle. \tag{7.8} $$

This is somewhat less abstract than (7.4), because both $\langle j|\psi_t\rangle$ and $\langle j|H|\psi_t\rangle$ are
simply complex numbers which depend upon t, and thus are complex-valued
functions of the time. By writing $|\psi_t\rangle$ as a linear combination of the basis vectors,

$$ |\psi_t\rangle = \sum_j |j\rangle\langle j|\psi_t\rangle = \sum_j c_j(t)|j\rangle, \tag{7.9} $$

with time-dependent coefficients $c_j(t)$, and expressing the right side of (7.8) in the
form

$$ \langle j|H|\psi_t\rangle = \sum_k \langle j|H|k\rangle\langle k|\psi_t\rangle, \tag{7.10} $$

one finds that the Schrödinger equation is equivalent to a collection of coupled
linear differential equations

$$ i\hbar\, dc_j/dt = \sum_k \langle j|H|k\rangle c_k \tag{7.11} $$

for the $c_j(t)$. The operator H, and therefore also its matrix elements, can be a
function of the time, but it must be a Hermitian operator at every time, that is, for
any j and k,

$$ \langle j|H(t)|k\rangle = \langle k|H(t)|j\rangle^*. \tag{7.12} $$

When $\mathcal{H}$ is two-dimensional, (7.11) has the form

$$ \begin{aligned} i\hbar\, dc_1/dt &= \langle 1|H|1\rangle c_1 + \langle 1|H|2\rangle c_2, \\ i\hbar\, dc_2/dt &= \langle 2|H|1\rangle c_1 + \langle 2|H|2\rangle c_2. \end{aligned} \tag{7.13} $$

These are linear equations, and if H, and thus its matrix elements, is independent of
time, one can find the general solution by diagonalizing the matrix of coefficients
on the right side. Let us assume that this has already been done, since we earlier
made no assumptions about the basis $\{|j\rangle\}$, apart from the fact that it is independent
of time. That is, assume that

$$ H = E_1|1\rangle\langle 1| + E_2|2\rangle\langle 2|, \tag{7.14} $$

so that the off-diagonal terms $\langle 1|H|2\rangle$ and $\langle 2|H|1\rangle$ in (7.13) vanish, while the
diagonal terms are $E_1$ and $E_2$. Then the general solution of (7.13) is of the form

$$ c_1(t) = b_1 e^{-iE_1 t/\hbar}, \qquad c_2(t) = b_2 e^{-iE_2 t/\hbar}, \tag{7.15} $$

where $b_1$ and $b_2$ are arbitrary (complex) constants.
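
The two-level equations (7.13) are simple enough to integrate numerically, which gives a way of checking the closed form (7.15). The following is a minimal sketch, not part of the original text: it assumes units in which ħ = 1, uses a standard fourth-order Runge-Kutta step, and the function names (`rhs`, `rk4_step`, `integrate`, `exact_diagonal`) as well as the sample numbers are hypothetical choices made for illustration.

```python
import cmath

HBAR = 1.0  # assumption: units chosen so that hbar = 1


def rhs(h, c):
    """Right side of (7.13) solved for dc/dt: (-i/hbar) * sum_k <j|H|k> c_k."""
    return [(-1j / HBAR) * (h[j][0] * c[0] + h[j][1] * c[1]) for j in range(2)]


def rk4_step(h, c, dt):
    """Advance the coefficients by dt with one fourth-order Runge-Kutta step."""
    def shifted(k, s):
        return [c[j] + s * k[j] for j in range(2)]
    k1 = rhs(h, c)
    k2 = rhs(h, shifted(k1, dt / 2))
    k3 = rhs(h, shifted(k2, dt / 2))
    k4 = rhs(h, shifted(k3, dt))
    return [c[j] + dt / 6 * (k1[j] + 2 * k2[j] + 2 * k3[j] + k4[j]) for j in range(2)]


def integrate(h, c0, t, steps=2000):
    """Integrate (7.13) from time 0 to t for a time-independent 2x2 matrix h."""
    c, dt = list(c0), t / steps
    for _ in range(steps):
        c = rk4_step(h, c, dt)
    return c


def exact_diagonal(e1, e2, b1, b2, t):
    """The solution (7.15), valid when H is diagonal with entries e1 and e2."""
    return [b1 * cmath.exp(-1j * e1 * t / HBAR), b2 * cmath.exp(-1j * e2 * t / HBAR)]


if __name__ == "__main__":
    # Hypothetical energies and initial coefficients, chosen only for illustration.
    print(integrate([[0.5, 0.0], [0.0, 1.5]], [0.6, 0.8j], t=2.0))
    print(exact_diagonal(0.5, 1.5, 0.6, 0.8j, 2.0))
```

For a diagonal matrix the two printed results should agree to within the integration error. Passing a Hermitian matrix with nonzero off-diagonal elements to `integrate` gives a solution of the coupled equations for which no diagonalization has been carried out in advance.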

The precession of the spin angular momentum of a spin-half particle placed
in a constant magnetic field is an example of a two-level system with a
time-independent Hamiltonian. If the magnetic field is $\mathbf{B} = (B_x, B_y, B_z)$, the
Hamiltonian is

$$ H = -\gamma(B_x S_x + B_y S_y + B_z S_z), \tag{7.16} $$

where the spin operators $S_x$, etc., are defined in the manner indicated in (5.30), and
$\gamma$ is the gyromagnetic ratio of the particle in suitable units. This Hamiltonian will
be diagonal in the basis $|w^+\rangle$, $|w^-\rangle$, see (4.14), where w is in the direction of the
magnetic field $\mathbf{B}$.

The preceding example has an obvious generalization to the case in which a
time-independent Hamiltonian is diagonal in an orthonormal basis $\{|e_n\rangle\}$:

$$ H = \sum_n E_n |e_n\rangle\langle e_n|. \tag{7.17} $$

Then a general solution of Schrödinger’s equation has the form

$$ |\psi_t\rangle = \sum_n b_n e^{-iE_n t/\hbar}|e_n\rangle, \tag{7.18} $$

where the $b_n$ are complex constants. One can check this by evaluating the time
derivative

$$ i\hbar \frac{d}{dt}|\psi_t\rangle = \sum_n b_n E_n e^{-iE_n t/\hbar}|e_n\rangle, \tag{7.19} $$

and verifying that it is equal to $H|\psi_t\rangle$. An alternative way of writing (7.18) is

$$ |\psi_t\rangle = e^{-itH/\hbar}|\psi_0\rangle, \tag{7.20} $$

where $|\psi_0\rangle$ is $|\psi_t\rangle$ when t = 0. The operator $e^{-itH/\hbar}$ is defined in the manner
indicated in Sec. 3.10, see (3.97). It can be written down explicitly as

$$ e^{-itH/\hbar} = \sum_n e^{-iE_n t/\hbar}|e_n\rangle\langle e_n|. \tag{7.21} $$

In the case of a harmonic oscillator, with

$$ E_n = (n + 1/2)\hbar\omega \tag{7.22} $$

the energy and $|\phi_n\rangle$ the eigenstate of the nth level, (7.18) is equivalent to

$$ \psi(x, t) = e^{-i\omega t/2}\sum_n b_n e^{-in\omega t}\phi_n(x), \tag{7.23} $$

where $\phi_n(x)$ is the position wave function corresponding to the ket $|\phi_n\rangle$.

A particular case of (7.18) is that in which $b_n = \delta_{np}$, that is, all except one of the
$b_n$ vanish, so that

$$ |\psi_t\rangle = e^{-iE_p t/\hbar}|e_p\rangle. \tag{7.24} $$

The only time dependence comes in the phase factor, but since two states which 
differ by a phase factor have exactly the same physical significance, a quantum state 
with a precisely defined energy, known as a stationary state, represents a physical 
situation which is completely independent of time. By contrast, a classical system 
with a precisely defined energy will typically have a nontrivial time dependence; 
e.g., a harmonic oscillator tracing out an ellipse in the classical phase plane. 

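The inner product $\langle\omega_t|\psi_t\rangle$ of any two solutions of Schrödinger’s equation is
independent of time. To see this, note that $|\psi_t\rangle$ satisfies (7.8), while the complex
conjugate of this equation with $\psi_t$ replaced by $\omega_t$ is

$$ -i\hbar \frac{d}{dt}\langle\omega_t|j\rangle = \langle\omega_t|H|j\rangle, \tag{7.25} $$

where $H^\dagger$ on the right side has been replaced with H, since the Hamiltonian (which
could depend upon the time) is Hermitian. Using (7.8) along with (7.25), one
arrives at the result

$$ \begin{aligned} i\hbar\frac{d}{dt}\langle\omega_t|\psi_t\rangle &= i\hbar\frac{d}{dt}\sum_j \langle\omega_t|j\rangle\langle j|\psi_t\rangle \\ &= -\sum_j \langle\omega_t|H|j\rangle\langle j|\psi_t\rangle + \sum_j \langle\omega_t|j\rangle\langle j|H|\psi_t\rangle = 0, \end{aligned} \tag{7.26} $$

since both of the last two sums are equal to $\langle\omega_t|H|\psi_t\rangle$. This means, in particular,
that the norm $\|\psi_t\|$ of a solution $|\psi_t\rangle$ of Schrödinger’s equation is independent of
time, since it is the square root of the inner product of the ket with itself.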
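
The conservation of inner products can be checked numerically in the special case of a time-independent Hamiltonian, where the solution (7.18) is available in closed form. The sketch below is an illustration only: it assumes units with ħ = 1, and the energies, the initial coefficients, and the function names `evolve` and `inner` are hypothetical.

```python
import cmath

HBAR = 1.0  # assumption: units chosen so that hbar = 1


def evolve(b, energies, t):
    """Coefficients of |psi_t> in the energy eigenbasis, following (7.18)."""
    return [bn * cmath.exp(-1j * en * t / HBAR) for bn, en in zip(b, energies)]


def inner(omega, psi):
    """Inner product <omega|psi>, antilinear in its first argument."""
    return sum(w.conjugate() * p for w, p in zip(omega, psi))


if __name__ == "__main__":
    energies = [0.3, 1.1, 2.7]             # hypothetical eigenvalues E_n
    psi0 = [0.5 + 0j, 0.5j, 0.7 + 0j]      # hypothetical coefficients, not normalized
    omega0 = [1.0 + 0j, -0.2 + 0.4j, 0.3j]
    for t in (0.0, 0.8, 5.0):
        # The same complex number should be printed for every t.
        print(t, inner(evolve(omega0, energies, t), evolve(psi0, energies, t)))
```

Each coefficient of one state picks up the phase $e^{-iE_n t/\hbar}$ while the conjugated coefficient of the other picks up the opposite phase, so the products, and hence the sum, do not change with t.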

The fact that the Schrödinger equation preserves inner products and norms means
that its action on the ket vectors in the Hilbert space is analogous to rigidly rotating 
a collection of vectors in ordinary three-dimensional space about the origin. If one 
thinks of these vectors as arrows directed outwards from the origin, the rotation 
will leave the lengths of the vectors and the angles between them, and hence the 
dot product of any two of them, unchanged, in the same way that inner products 
of vectors in the Hilbert space are left unchanged by the Schrödinger equation. An
operator on the Hilbert space which leaves inner products unchanged is called an 
isometry. If, in addition, it maps the space onto itself, it is a unitary operator. 
Some important properties of unitary operators are stated in the next section, and 
we shall return to the topic of time development in Sec. 7.3. 

7.2 Unitary operators

An operator U on a Hilbert space $\mathcal{H}$ is said to be unitary provided (i) it is an
isometry, and (ii) it maps $\mathcal{H}$ onto itself. An isometry preserves inner products, so
condition (i) is equivalent to

$$ (U|\omega\rangle)^\dagger U|\phi\rangle = \langle\omega|U^\dagger U|\phi\rangle = \langle\omega|\phi\rangle \tag{7.27} $$

for any pair of kets $|\omega\rangle$ and $|\phi\rangle$ in $\mathcal{H}$, and this in turn will be true if and only if

$$ U^\dagger U = I. \tag{7.28} $$

Condition (ii) means that given any $|\eta\rangle$ in $\mathcal{H}$, there is some $|\psi\rangle$ in $\mathcal{H}$ such that
$|\eta\rangle = U|\psi\rangle$. This will be the case if, in addition to (7.28),

$$ U U^\dagger = I. \tag{7.29} $$

The two equalities (7.28) and (7.29) tell us that $U^\dagger$ is the same as the inverse $U^{-1}$ of
the operator U. For a finite-dimensional Hilbert space, condition (ii) for a unitary
operator is automatically satisfied in the case of an isometry, so (7.28) implies
(7.29), and vice versa, and it suffices to check one or the other in order to show that
U is unitary.
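
A standard illustration, not drawn from the text above, shows why condition (ii) has to be imposed separately when the space is infinite-dimensional. Assume an orthonormal basis $\{|n\rangle, n = 0, 1, 2, \ldots\}$ and let V (a symbol introduced here only for this example) be the one-sided shift:

$$ V|n\rangle = |n+1\rangle, \qquad V^\dagger V = I, \qquad V V^\dagger = I - |0\rangle\langle 0| \neq I. $$

Since V maps an orthonormal basis into an orthonormal collection, it preserves inner products and is an isometry. However, no ket is mapped onto $|0\rangle$, so V does not map the space onto itself and is not unitary. In a finite-dimensional space an orthonormal collection with as many members as the dimension is automatically a basis, so this cannot happen there.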

If U is unitary, then so is $U^\dagger$. In addition, the operator product of two or more
unitary operators is a unitary operator. This follows at once from (7.28) and (7.29)
and the rule giving the adjoint of a product of operators, (3.32). Thus if both U and
V satisfy (7.28), so does their product,

$$ (UV)^\dagger UV = V^\dagger U^\dagger U V = V^\dagger I V = I, \tag{7.30} $$

and the same is true for (7.29).

A second, equivalent definition of a unitary operator is the following: Let $\{|j\rangle\}$
be some orthonormal basis of $\mathcal{H}$. Then U is unitary if and only if $\{U|j\rangle\}$ is also an
orthonormal basis. If $\mathcal{H}$ is of finite dimension, one need only check that $\{U|j\rangle\}$ is
an orthonormal collection, for then it will also be an orthonormal basis, given that
$\{|j\rangle\}$ is such a basis.

The matrix $\{\langle j|U|k\rangle\}$ of a unitary operator in an orthonormal basis can be
thought of as a collection of column vectors which are normalized and mutually
orthogonal, a result which follows at once from (7.28) and the usual rule for matrix
multiplication. Similarly, (7.29) tells one that the row vectors which make up this
matrix are normalized and mutually orthogonal. Any 2 × 2 unitary matrix can be
written in the form

$$ e^{i\phi}\begin{pmatrix} \alpha & \beta \\ -\beta^* & \alpha^* \end{pmatrix}, \tag{7.31} $$

where $\alpha$ and $\beta$ are complex numbers satisfying

$$ |\alpha|^2 + |\beta|^2 = 1 \tag{7.32} $$

and $\phi$ is an arbitrary phase. It is obvious from (7.32) that the two column vectors
making up this matrix are normalized, and their orthogonality is easily checked.
The same is true of the two row vectors.
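
The claims about (7.31) can be verified mechanically. The following sketch, which is an illustration rather than anything taken from the text, builds a matrix of the form (7.31) from a phase and two complex numbers and tests both (7.28) and (7.29). The function names are hypothetical, and the rescaling of the inputs so that (7.32) holds is a convenience assumed here; the inputs must not both be zero.

```python
import cmath


def unitary_2x2(phi, alpha, beta):
    """Matrix of the form (7.31); alpha and beta are rescaled to satisfy (7.32)."""
    alpha, beta = complex(alpha), complex(beta)
    norm = (abs(alpha) ** 2 + abs(beta) ** 2) ** 0.5
    a, b = alpha / norm, beta / norm
    phase = cmath.exp(1j * phi)
    return [[phase * a, phase * b],
            [-phase * b.conjugate(), phase * a.conjugate()]]


def dagger(m):
    """Adjoint of a 2x2 matrix: transpose and complex conjugate."""
    return [[m[k][j].conjugate() for k in range(2)] for j in range(2)]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[j][m] * b[m][k] for m in range(2)) for k in range(2)]
            for j in range(2)]


def is_identity(m, tol=1e-12):
    """True if m agrees with the 2x2 identity to within tol."""
    return all(abs(m[j][k] - (1.0 if j == k else 0.0)) < tol
               for j in range(2) for k in range(2))


if __name__ == "__main__":
    u = unitary_2x2(0.7, 1 + 2j, 3 - 1j)       # hypothetical sample values
    print(is_identity(matmul(dagger(u), u)))    # condition (7.28)
    print(is_identity(matmul(u, dagger(u))))    # condition (7.29)
```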

Given a unitary operator, one can find an orthonormal basis $\{|u_j\rangle\}$ in which it
can be written in diagonal form

$$ U = \sum_j \lambda_j |u_j\rangle\langle u_j|, \tag{7.33} $$

where the eigenvalues $\lambda_j$ of U are complex numbers with $|\lambda_j| = 1$. Just as Hermitian
operators can be thought of as somewhat analogous to real numbers, since their
eigenvalues are real, unitary operators are analogous to complex numbers of unit
modulus. (In an infinite-dimensional space the sum in (7.33) may have to be
replaced by an appropriate integral.) As in the case of Hermitian operators, Sec. 3.7,
if some of the eigenvalues in (7.33) are degenerate, the sum can be rewritten in the
form

$$ U = \sum_k \lambda'_k S_k, \tag{7.34} $$

where the $S_k$ are projectors which form a decomposition of the identity, and $\lambda'_k \neq \lambda'_l$
for $k \neq l$.

All operators in a collection $\{U, V, W, \ldots\}$ of unitary operators which commute
with each other can be simultaneously diagonalized using a single orthonormal
basis. That is, there is some basis $\{|u_j\rangle\}$ in which U, V, W, and so forth can
be expressed using the same collection of dyads $|u_j\rangle\langle u_j|$, as in (7.33), but with
different eigenvalues for the different operators. If one writes down expressions
of the form (7.34) for V, W, etc., the decompositions of the identity need not be
identical with the $\{S_k\}$ appropriate for U, but the different decompositions will all
be compatible in the sense that the projectors will all commute with one another.


7.3 Time development operators 

Consider integrating Schrödinger’s equation from time t = 0 to $t = \tau$ starting
from an arbitrary initial state $|\psi_0\rangle$. Because the equation is linear, the dependence
of the state $|\psi_\tau\rangle$ at time $\tau$ upon the initial state $|\psi_0\rangle$ can be written in the form

$$ |\psi_\tau\rangle = T(\tau, 0)|\psi_0\rangle, \tag{7.35} $$

where $T(\tau, 0)$ is a linear operator. And because Schrödinger’s equation preserves
inner products, (7.26), $T(\tau, 0)$ is an isometry. In addition, it maps $\mathcal{H}$ onto itself,
because if $|\eta\rangle$ is any ket in $\mathcal{H}$, we can treat it as a “final” condition at time $\tau$ and
integrate Schrödinger’s equation backwards to time 0 in order to obtain a ket $|\zeta\rangle$
such that $|\eta\rangle = T(\tau, 0)|\zeta\rangle$. Therefore $T(\tau, 0)$ is a unitary operator, since it satisfies
the conditions given in Sec. 7.2.

Of course there is nothing special about the times 0 and $\tau$, and the same argument
could be applied equally well to the integration of Schrödinger’s equation between
two arbitrary times $t'$ and $t$, where $t$ can be earlier or later than $t'$. That is to say,
there is a collection of unitary time development operators $T(t, t')$, labeled by the
two times $t$ and $t'$, such that if $|\psi_t\rangle$ is any solution of Schrödinger’s equation, then

$$ |\psi_t\rangle = T(t, t')|\psi_{t'}\rangle. \tag{7.36} $$

These time development operators satisfy a set of fairly obvious conditions. First,
if $t'$ is equal to $t$,

$$ T(t, t) = I. \tag{7.37} $$

Next, since

$$ |\psi_t\rangle = T(t, t'')|\psi_{t''}\rangle = T(t, t')|\psi_{t'}\rangle = T(t, t')T(t', t'')|\psi_{t''}\rangle, \tag{7.38} $$

it follows that

$$ T(t, t')T(t', t'') = T(t, t'') \tag{7.39} $$

for any three times $t$, $t'$, $t''$. In particular, if we set $t'' = t$ in this expression and use
(7.37), the result is

$$ T(t, t')T(t', t) = I. \tag{7.40} $$

Since $T(t, t')$ is a unitary operator, this tells us that

$$ T(t', t) = T(t, t')^\dagger = T(t, t')^{-1}. \tag{7.41} $$

Thus the adjoint of a time development operator, which is the same as its inverse,
is obtained by interchanging its two arguments.

If one applies the dagger operation to (7.36), see (3.33), the result is

$$ \langle\psi_t| = \langle\psi_{t'}|T(t, t')^\dagger = \langle\psi_{t'}|T(t', t). \tag{7.42} $$

Consequently, the projectors $[\psi_t]$ and $[\psi_{t'}]$ onto the rays containing $|\psi_t\rangle$ and $|\psi_{t'}\rangle$
are related by

$$ [\psi_t] = |\psi_t\rangle\langle\psi_t| = T(t, t')|\psi_{t'}\rangle\langle\psi_{t'}|T(t', t) = T(t, t')[\psi_{t'}]T(t', t). \tag{7.43} $$

This formula can be generalized to the case in which $P_{t'}$ is any projector onto some
subspace $\mathcal{P}_{t'}$ of the Hilbert space. Then $P_t$ defined by

$$ P_t = T(t, t')P_{t'}T(t', t) \tag{7.44} $$

is a projector onto a subspace $\mathcal{P}_t$ with the property that if $|\psi_{t'}\rangle$ is any ket in $\mathcal{P}_{t'}$,
its image $T(t, t')|\psi_{t'}\rangle$ under the time translation operator lies in $\mathcal{P}_t$, and $\mathcal{P}_t$ is
composed of kets of this form. That is to say, the same unitary dynamics which “moves”
one ket onto another through (7.36) “moves” subspaces in the manner indicated in
(7.44). The difference is that only a single T operator is needed to move kets, while
two are necessary in order to move a projector.

Since $|\psi_t\rangle$ in (7.36) satisfies Schrödinger’s equation (7.4), it follows that

$$ i\hbar\frac{\partial T(t, t')}{\partial t} = H(t)T(t, t'), \tag{7.45} $$

where one can write H in place of H(t) if the Hamiltonian is independent of time.
There is a similar equation in which the first argument of $T(t, t')$ is held fixed,

$$ -i\hbar\frac{\partial T(t, t')}{\partial t'} = T(t, t')H(t'), \tag{7.46} $$

obtained by taking the adjoint of (7.45) with the help of (7.41), and then
interchanging $t$ and $t'$. Given a time-independent orthonormal basis $\{|j\rangle\}$, (7.45) is
equivalent to a set of coupled ordinary differential equations for the matrix
elements of $T(t, t')$,

$$ i\hbar\frac{\partial}{\partial t}\langle j|T(t, t')|k\rangle = \sum_m \langle j|H(t)|m\rangle\langle m|T(t, t')|k\rangle, \tag{7.47} $$

and one can write down an analogous expression corresponding to (7.46).

Obtaining explicit forms for the time development operators is in general a very
difficult task, since it is equivalent to integrating the Schrödinger equation for all
possible initial conditions. However, if the Hamiltonian is independent of time,
one can write

$$ T(t, t') = e^{-i(t - t')H/\hbar} = \sum_n e^{-iE_n(t - t')/\hbar}|e_n\rangle\langle e_n|, \tag{7.48} $$

where the $E_n$ and $|e_n\rangle$ are the eigenvalues and eigenfunctions of H, (7.17). Thus
when the Hamiltonian is independent of time, $T(t, t')$ depends only on the
difference $t - t'$ of its two arguments.
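
For a two-level system the exponential in (7.48) can be written out in closed form, which makes it possible to test the composition rule (7.39) and the adjoint rule (7.41) directly. The sketch below is illustrative only. It assumes units with ħ = 1 and a Hamiltonian parametrized as $H = a I + b_x\sigma_x + b_y\sigma_y + b_z\sigma_z$ in terms of the Pauli matrices, a parametrization chosen here for convenience rather than taken from the text, and it uses the standard identity $e^{-i\theta\, \mathbf{n}\cdot\boldsymbol{\sigma}} = \cos\theta\, I - i \sin\theta\, \mathbf{n}\cdot\boldsymbol{\sigma}$ for a unit vector $\mathbf{n}$. All function names are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units chosen so that hbar = 1


def time_dev(a, b, t, t_prime):
    """T(t, t') = exp(-i (t - t') H / hbar) for the 2x2 matrix H = a*I + b . sigma."""
    bx, by, bz = b
    norm = math.sqrt(bx * bx + by * by + bz * bz)
    tau = (t - t_prime) / HBAR
    phase = cmath.exp(-1j * a * tau)
    if norm == 0.0:
        return [[phase, 0j], [0j, phase]]
    c, s = math.cos(norm * tau), math.sin(norm * tau)
    nx, ny, nz = bx / norm, by / norm, bz / norm
    # n . sigma has rows (nz, nx - i ny) and (nx + i ny, -nz).
    return [[phase * (c - 1j * s * nz), phase * (-1j * s * (nx - 1j * ny))],
            [phase * (-1j * s * (nx + 1j * ny)), phase * (c + 1j * s * nz)]]


def dagger(m):
    """Adjoint of a 2x2 matrix."""
    return [[m[k][j].conjugate() for k in range(2)] for j in range(2)]


def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[j][m] * y[m][k] for m in range(2)) for k in range(2)]
            for j in range(2)]


def close(x, y, tol=1e-12):
    """True if two 2x2 matrices agree entry by entry to within tol."""
    return all(abs(x[j][k] - y[j][k]) < tol for j in range(2) for k in range(2))


if __name__ == "__main__":
    a, b = 0.4, (0.3, -0.2, 0.5)   # hypothetical Hamiltonian parameters
    product = matmul(time_dev(a, b, 2.0, 0.5), time_dev(a, b, 0.5, -1.0))
    print(close(product, time_dev(a, b, 2.0, -1.0)))                          # (7.39)
    print(close(time_dev(a, b, 0.5, 2.0), dagger(time_dev(a, b, 2.0, 0.5))))  # (7.41)
```

Because `time_dev` uses only the difference `t - t_prime`, the sketch also makes the last remark of the paragraph above concrete.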


7.4 Toy models 

The unitary dynamics of most quantum systems is quite complicated and difficult
to understand. Among the few exceptions are: trivial dynamics, in which
$T(t', t) = I$ independent of $t$ and $t'$; a spin-half particle in a constant magnetic
field with Hamiltonian (7.16); and the harmonic oscillator, which has a simple
time dependence because its energy levels have a uniform spacing, (7.22). Even a
particle moving in one dimension in the potential V(x) = 0 represents a nontrivial
dynamical problem in quantum theory. Though one can write down closed-form
solutions, they tend to be a bit messy, especially in comparison with the simple
trajectory $x = x_0 + (p_0/m)t$ and $p = p_0$ in the classical phase space.

In order to gain some intuitive understanding of quantum dynamics, it is impor- 
tant to have simple model systems whose properties can be worked out explicitly 
with very little effort “on the back of an envelope”, but which allow more compli- 
cated behavior than occurs in the case of a spin-half particle or a harmonic oscilla- 
tor. We want to be able to discuss interference effects, measurements, radioactive 
decay, and so forth. For this purpose toy models resembling the one introduced in 
Sec. 2.5, where a particle can be located at one of a finite number of discrete sites, 
turn out to be particularly useful. The key to obtaining simple dynamics in a toy 
model is to make time (like space) a discrete variable. Thus we shall assume that 
the time t takes on only integer values: −1, 0, 1, 2, etc. These could, in principle,
be integer multiples of some very short interval of time, say $10^{-50}$ seconds, so
discretization is not, by itself, much of a limitation (or simplification). 

Though it is not essential, in many cases one can assume that $T(t, t')$ depends
only on the time difference $t - t'$; this is the toy analog of a time-independent
Hamiltonian. Then one can write

$$ T(t, t') = T^{t - t'}, \tag{7.49} $$

where the symbol T without any arguments will represent a unitary operator on
the (usually finite-dimensional) Hilbert space of the toy model. The strategy for
constructing a useful toy model is to make T a very simple operator, as in the
examples discussed below. Because t takes integer values, $T(t, t')$ is given by
integer powers of the operator T, and can be calculated by applying T several
times in a row. To be sure, these powers can be negative, but that is not so bad,
because we will be able to choose T in such a way that its inverse $T^{-1} = T^\dagger$ is
also a very simple operator.

As a first example, consider the model introduced in Sec. 2.5 with a particle
located at one of $M = M_a + M_b + 1$ sites placed in a one-dimensional line and
labeled with an integer m in the interval

$$ -M_a \le m \le M_b, \tag{7.50} $$

where $M_a$ and $M_b$ are large integers. This becomes a hopping model if the time
development operator T is set equal to the shift operator S defined by

$$ S|m\rangle = |m+1\rangle, \qquad S|M_b\rangle = |{-M_a}\rangle. \tag{7.51} $$

That is, during a single time step the particle hops one space to the right, but when
it comes to the maximum value of m it hops to the minimum value. Thus the
dynamics has a “periodic boundary condition”, and one may prefer to imagine the
successive sites as located not on a line but on a large circle, so that the one labeled
$M_b$ is just to the left of the one labeled $-M_a$. One must check that T = S is unitary,
and this is easily done. The collection of kets $\{|m\rangle\}$ forms an orthonormal basis of
the Hilbert space, and the collection $\{S|m\rangle\}$, since it consists of precisely the same
elements, is also an orthonormal basis. Thus the criterion in the second definition
in Sec. 7.2 is satisfied, and S is unitary.
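
Since S simply permutes the basis kets, its action can be tracked by following site labels. The short sketch below, offered as an illustration under that assumption, represents a basis ket $|m\rangle$ by the integer m. The function names are hypothetical, and negative values of `steps` correspond to powers of the inverse $S^{-1} = S^\dagger$.

```python
def shift(m, m_a, m_b):
    """Site reached from m after one application of S in (7.51)."""
    return -m_a if m == m_b else m + 1


def shift_power(m, steps, m_a, m_b):
    """Site reached after `steps` applications of S (negative steps use the inverse)."""
    size = m_a + m_b + 1                      # number of sites, M = M_a + M_b + 1
    return (m + m_a + steps) % size - m_a


def is_permutation(m_a, m_b):
    """True if S maps the set of basis labels onto itself, one-to-one."""
    sites = list(range(-m_a, m_b + 1))
    return sorted(shift(m, m_a, m_b) for m in sites) == sites


if __name__ == "__main__":
    print(is_permutation(2, 3))
    print([shift_power(0, t, 2, 3) for t in range(-2, 8)])
```

The check in `is_permutation` is the label-level counterpart of the statement that $\{S|m\rangle\}$ consists of precisely the same kets as $\{|m\rangle\}$.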



Fig. 7.1. Toy model of particle with detector.


To make the hopping model a bit more interesting, let us add a detector, a second
particle which can be at only one of the two sites n = 0 or 1 indicated in Fig. 7.1.
The Hilbert space $\mathcal{H}$ for this system is, as noted in Sec. 6.3, a tensor product $\mathcal{M} \otimes \mathcal{N}$
of an M-dimensional space $\mathcal{M}$ for the first particle and a two-dimensional space
$\mathcal{N}$ for the detector, and the 2M kets $\{|m, n\rangle\}$ form an orthonormal basis. What
makes the detector act like a detector is a choice for the unitary dynamics in which
the time development operator is

$$ T = SR, \tag{7.52} $$

where $S = S \otimes I$, using the notation of Sec. 6.4, is the extension to $\mathcal{M} \otimes \mathcal{N}$ of the
shift operator defined earlier on $\mathcal{M}$ using (7.51), and R is defined by

$$ R|m, n\rangle = |m, n\rangle \ \text{for } m \neq 2, \qquad R|2, n\rangle = |2, 1 - n\rangle. \tag{7.53} $$
Thus R does nothing at all unless the particle is at m = 2, in which case it “flips”
the detector from n = 0 to n = 1 and vice versa. That R is unitary follows from
the fact that the collection of kets $\{R|m, n\rangle\}$ is identical to the collection $\{|m, n\rangle\}$,
as all that R does is interchange two of them, and is thus an orthonormal basis of
$\mathcal{H}$. The extended operator $S \otimes I$ satisfies (7.28) when S satisfies this condition, so
it is unitary. The unitarity of T = SR is then a consequence of the fact that the
product of unitary operators is unitary, as noted in Sec. 7.2. (While it is not hard
to show directly that T is unitary, the strategy of writing it as a product of other
unitary operators is useful in more complicated cases, which is why we have used
it here.) The action of T = SR on the combined system of particle plus detector
is as follows. At each time step the particle hops from m to m + 1 (except when
it makes the big jump from $M_b$ to $-M_a$). The detector remains at n = 0 or at
n = 1, wherever it happens to be, except during a time step in which the particle
hops from 2 to 3, when the detector hops from n to 1 − n, that is, from 0 to 1 or 1
to 0.

What justifies calling the detector a detector? Let us use a notation in which $\mapsto$
denotes the action of T in the sense that

$$ |\psi\rangle \mapsto T|\psi\rangle \mapsto T^2|\psi\rangle \mapsto \cdots. \tag{7.54} $$

Suppose that the particle starts off at m = 0 and the detector is in the state n = 0,
“ready to detect the particle”, at t = 0. The initial state of the combined system of
particle plus detector develops in time according to

$$ |0, 0\rangle \mapsto |1, 0\rangle \mapsto |2, 0\rangle \mapsto |3, 1\rangle \mapsto |4, 1\rangle \mapsto \cdots. \tag{7.55} $$

That is to say, during the time step from t = 2 to t = 3, in which the particle
hops from m = 2 to m = 3, the detector moves from n = 0, “ready”, to n = 1,
“have detected the particle,” and it continues in the “have detected” state at later
times. Not at all later times, since the particle will eventually hop from $M_b > 0$
to $-M_a < 0$, and then m will increase until, eventually, the particle will pass
by the detector a second time and “untrigger” it. But by making $M_a$ or $M_b$ large
compared with the times we are interested in, we can ignore this possibility. More
sophisticated models of detectors are certainly possible, and some of these will 
be introduced in later chapters. However, the essential spirit of the toy model 
approach is to use the simplest possibility which provides some physical intuition. 
The detector in Fig. 7.1 is perfectly adequate for many purposes, and will be used 
repeatedly in later chapters. 
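
The sequence (7.55) can be reproduced by following basis kets through repeated applications of T = SR. In the sketch below, which is an illustration and not part of the original text, a ket $|m, n\rangle$ is represented by the tuple `(m, n)`; the function names and the particular values of `m_a` and `m_b` are hypothetical. The approach works because both R and S map basis kets to basis kets.

```python
def step(state, m_a, m_b):
    """Apply T = S R of (7.52) to a basis ket |m, n>, given as a tuple (m, n)."""
    m, n = state
    if m == 2:                            # R flips the detector only when m = 2
        n = 1 - n
    m = -m_a if m == m_b else m + 1       # S shifts the particle, with wraparound
    return (m, n)


def history(state, steps, m_a, m_b):
    """List of basis kets visited at times t = 0, 1, ..., steps."""
    visited = [state]
    for _ in range(steps):
        state = step(state, m_a, m_b)
        visited.append(state)
    return visited


if __name__ == "__main__":
    # With m_a = m_b = 10 there are 21 sites; the first five kets reproduce (7.55).
    print(history((0, 0), 4, 10, 10))
    # After one further trip around the circle the detector is "untriggered".
    print(history((0, 0), 24, 10, 10)[-1])
```

The second call shows the return process that the text sets aside by taking $M_a$ or $M_b$ large: once the particle has passed m = 2 a second time, the detector is back at n = 0.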

It is worth noting that the measurement of the particle’s position (or its passing 
the position of the detector) in this way does not influence the motion of the parti- 
cle: in the absence of the detector one would have the same sequence of positions 
m as a function of time as those in (7.55). But is it not the case that any quantum 
measurement perturbs the measured system? One of the benefits of introducing 
toy models is that they make it possible to study this and other pieces of quantum 
folklore in specific situations. In later chapters we will explore the issue of per- 
turbations produced by measurements in more detail. For the present it is enough 
to note that quantum measurement apparatus can be designed so that it does not 
perturb certain properties, even though it may perturb other properties. 

Another example of a toy model is the one in Fig. 7.2, which can be used to 
illustrate the process of radioactive decay. Consider alpha decay, and adopt the 
picture in which an alpha particle is rattling around inside a nucleus until it even- 
tually tunnels out through the Coulomb barrier. One knows that this is a fairly 
good description of the escape process, even though it is bad nuclear physics if
taken too literally. However, the unitary time development of a particle tunneling
through a potential barrier is not easy to compute; one needs WKB formulas and
other approximations.

Fig. 7.2. Toy model for alpha decay.

By contrast, the unitary time development of the toy model in Fig. 7.2, which is
a slight modification of the hopping model (without a detector) introduced earlier,
can be worked out very easily. The different sites represent possible locations of the
alpha particle, with the m = 0 site inside the nucleus, and the other sites outside
the nucleus. At each time step, the particle at m = 0 can either stay put, with
amplitude $\alpha$, or escape to m = 1, with amplitude $\beta$. Once it reaches m = 1, it
hops to m = 2 at the next time step, and then on to m = 3, etc. Eventually it
will hop from $m = +M_b$ to $m = -M_a$ and begin its journey back towards the
nucleus, but we will assume that $M_b$ is so large that we never have to consider the
return process. (One could make $M_a$ and $M_b$ infinite, at the price of introducing
an infinite-dimensional Hilbert space.) The time development operator is $T = S_a$,
where

$$ \begin{aligned} S_a|m\rangle &= |m+1\rangle \ \text{for } m \neq 0, -1, M_b, \qquad S_a|M_b\rangle = |{-M_a}\rangle, \\ S_a|0\rangle &= \alpha|0\rangle + \beta|1\rangle, \qquad S_a|{-1}\rangle = \gamma|0\rangle + \delta|1\rangle. \end{aligned} \tag{7.56} $$

Thus $S_a$ is identical to the simple shift S of (7.51), except when applied to the two
kets $|0\rangle$ and $|{-1}\rangle$.

The operator $S_a$ is unitary if the complex constants $\alpha, \beta, \gamma, \delta$ form a unitary
matrix

$$ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}. \tag{7.57} $$

If we use the criterion, Sec. 7.2, that the row vectors are normalized and mutually
orthogonal, the conditions for unitarity can be written in the form:

$$ |\alpha|^2 + |\beta|^2 = 1 = |\gamma|^2 + |\delta|^2, \qquad \alpha^*\gamma + \beta^*\delta = 0. \tag{7.58} $$

That $S_a$ is unitary when (7.58) is satisfied can be seen from the fact that it maps the
orthonormal basis $\{|m\rangle\}$ into an orthonormal collection of vectors, which, since the
Hilbert space is finite-dimensional, must itself be an orthonormal basis. In particular,
$S_a$ applied to $|0\rangle$ and to $|{-1}\rangle$ yields two normalized vectors which are mutually
orthogonal, a result ensured by (7.58).
Note how the requirement of unitarity leads to the nontrivial consequence that if
the action of the shift operator S on $|0\rangle$ is modified so that the particle can either hop
or remain in place during one time step, there must be an additional modification
of S someplace else. In this example the other modification occurs at $|{-1}\rangle$, which
is a fairly natural place to put it. The fact that $|\gamma| = |\beta|$ means that if there is
an amplitude for the alpha particle to escape from the nucleus, there is also an
amplitude for an alpha particle approaching the nucleus along the m < 0 sites to
be captured at m = 0, rather than simply being scattered to m = 1. As one might
expect, $|\beta|^2$ is the probability that the alpha particle will escape during a particular
time step, and $|\alpha|^2$ the probability that it will remain in the nucleus. However,
showing that this is so requires additional developments of the theory found in the
following chapters; see Secs. 9.5 and 12.4.

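The unitary time development of an initial state $|0\rangle$ at t = 0, corresponding to
the alpha particle being inside the nucleus, is easily worked out. Using the $\mapsto$
notation of (7.54), one has:

$$ \begin{aligned} |0\rangle &\mapsto \alpha|0\rangle + \beta|1\rangle \mapsto \alpha^2|0\rangle + \alpha\beta|1\rangle + \beta|2\rangle \\ &\mapsto \alpha^3|0\rangle + \alpha^2\beta|1\rangle + \alpha\beta|2\rangle + \beta|3\rangle \mapsto \cdots, \end{aligned} \tag{7.59} $$

so that for any time t > 0,

$$ |\psi_t\rangle = T^t|0\rangle = \alpha^t|0\rangle + \alpha^{t-1}\beta|1\rangle + \alpha^{t-2}\beta|2\rangle + \cdots + \beta|t\rangle. \tag{7.60} $$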

The magnitude of the coefficient of |0) decreases exponentially with time. The rest 
of the time development can be thought of in the following way. An “initial wave” 
reaches site m at t = m. Thereafter, the coefficient of | m) decreases exponentially. 
That is, the wave function is spreading out and, at the same time, its amplitude is 
decreasing. These features are physically correct in that they will also emerge from 
a more sophisticated model of the decay process. Even though not every detail of 
the toy model is realistic, it nonetheless provides a good beginning for understand- 
ing some of the quantum physics of radioactive and other decay processes. 
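
The pattern in (7.59) and (7.60) is easy to reproduce by applying $S_a$ step by step. The following is a minimal sketch under stated assumptions: the state is stored as a dictionary from site label to amplitude, the constants passed in are assumed to satisfy (7.58), and the function names, default sizes `m_a = m_b = 50`, and sample amplitudes are hypothetical choices made for illustration.

```python
def step(amps, alpha, beta, gamma, delta, m_a, m_b):
    """Apply T = S_a of (7.56) once to a state stored as {site: amplitude}."""
    new = {}
    for m, c in amps.items():
        if m == 0:
            targets = [(0, alpha * c), (1, beta * c)]
        elif m == -1:
            targets = [(0, gamma * c), (1, delta * c)]
        elif m == m_b:
            targets = [(-m_a, c)]
        else:
            targets = [(m + 1, c)]
        for site, value in targets:
            new[site] = new.get(site, 0j) + value
    return new


def evolve(steps, alpha, beta, gamma, delta, m_a=50, m_b=50):
    """State T^steps |0>, starting with the particle inside the nucleus."""
    amps = {0: 1 + 0j}
    for _ in range(steps):
        amps = step(amps, alpha, beta, gamma, delta, m_a, m_b)
    return amps


def norm_squared(amps):
    """Squared norm of the state; unitarity keeps it equal to 1."""
    return sum(abs(c) ** 2 for c in amps.values())


if __name__ == "__main__":
    # Hypothetical constants satisfying (7.58): gamma = -conj(beta), delta = conj(alpha).
    alpha, beta = 0.8, 0.6j
    gamma, delta = 0.6j, 0.8
    psi = evolve(3, alpha, beta, gamma, delta)
    print(psi)                 # amplitudes at sites 0..3, to compare with (7.60)
    print(norm_squared(psi))
```

With these sample values the amplitude at m = 0 after three steps is $\alpha^3$, the amplitude at m = 3 is $\beta$, and the squared norm remains 1 up to rounding, as unitarity requires.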

8.1 Introduction

Despite the fact that classical mechanics employs deterministic dynamical laws, 
random dynamical processes often arise in classical physics, as well as in everyday 
life. A stochastic or random process is one in which states-of-affairs at successive 
times are not related to one another by deterministic laws, and instead probability 
theory is employed to describe whatever regularities exist. Tossing a coin or rolling 
a die several times in succession are examples of stochastic processes in which the 
previous history is of very little help in predicting what will happen in the future. 
The motion of a baseball is an example of a stochastic process which is to some 
degree predictable using classical equations of motion that relate its acceleration 
to the total force acting upon it. However, a lack of information about its initial 
state (e.g., whether it is spinning), its precise shape, and the condition and motion 
of the air through which it moves limits the precision with which one can predict 
its trajectory. 

The Brownian motion of a small particle suspended in a fluid and subject to random 
bombardment by the surrounding molecules of fluid is a well-studied example 
of a stochastic process in classical physics. Whereas the instantaneous velocity of 
the particle is hard to predict, there is a probabilistic correlation between succes- 
sive positions, which can be predicted using stochastic dynamics and checked by 
experimental measurements. In particular, given the particle’s position at a time t, 
it is possible to compute the probability that it will have moved a certain distance 
by the time $t + \Delta t$. The stochastic description of the motion of a Brownian particle 
uses the deterministic law for the motion of an object in a viscous fluid, and as- 
sumes that there is, in addition, a random force or “noise” which is unpredictable, 
but whose statistical properties are known. 

In classical physics the need to use stochastic rather than deterministic dynam- 
ical processes can be blamed on ignorance. If one knew the precise positions and 
velocities of all the molecules making up the fluid in which the Brownian particle 
is suspended, along with the same quantities for the molecules in the walls of the 
container and inside the Brownian particle itself, it would in principle be possible 
to integrate the classical equations of motion and make precise predictions about 
the motion of the particle. Of course, integrating the classical equations of motion 
with infinite precision is not possible. Nonetheless, in classical physics one can, 
in principle, construct more and more refined descriptions of a mechanical system, 
and thereby continue to reduce the noise in the stochastic dynamics in order to 
come arbitrarily close to a deterministic description. Knowing the spin imparted 
to a baseball by the pitcher allows a more precise prediction of its future trajec- 
tory. Knowing the positions and velocities of the fluid molecules inside a sphere 
centered at a Brownian particle makes it possible to improve one’s prediction of its 
motion, at least over a short time interval. 

The situation in quantum physics is similar, up to a point. A quantum description 
can be made more precise by using smaller, that is, lower-dimensional subspaces 
of the Hilbert space. However, while the refinement of a classical description can 
go on indefinitely, one reaches a limit in the quantum case when the subspaces 
are one-dimensional, since no finer description is possible. However, at this level 
quantum dynamics is still stochastic: there is an irreducible “quantum noise” which 
cannot be eliminated, even in principle. To be sure, quantum theory allows for a 
deterministic (and thus noise free) unitary dynamics, as discussed in the previous 
chapter. But there are many processes in the real world which cannot be discussed 
in terms of purely unitary dynamics based upon Schrödinger’s equation. Consequently, 
stochastic descriptions are a fundamental part of quantum mechanics in a 
sense which is not true in classical mechanics. 

In this chapter we focus on the kinematical aspects of classical and quantum 
stochastic dynamics: how to construct sample spaces and the corresponding event 
algebras. As usual, classical dynamics is simpler and provides a valuable guide 
and useful analogies for the quantum case, so various classical examples are taken 
up in Sec. 8.2. Quantum dynamics is the subject of the remainder of the chapter. 


8.2 Classical histories 

Consider a coin which is tossed three times in a row. The eight possible outcomes 
of this experiment are HHH, HHT, HTH, ... TTT: heads on all three tosses, 
heads the first two times and tails the third, and so forth. These eight possibilities 
constitute a sample space as that term is used in probability theory, see Sec. 5.1, 
since the different possibilities are mutually exclusive, and one and only one of 
them will occur in any particular experiment in which a coin is tossed three times 
in a row. The event algebra (Sec. 5.1) consists of the $2^8$ subsets of elements in the 
sample space: the empty set, HHH by itself, the pair {HHT, TTT}, and so forth. 
The elements of the sample space will be referred to as histories, where a history is 
to be thought of as a sequence of events at successive times. Members of the event 
algebra will also be called “histories” in a somewhat looser sense, or compound 
histories if they include two or more elements from the sample space. 
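
The counting in this example can be reproduced in a few lines. The following is a minimal sketch (the names `histories` and `event_algebra` are chosen here for illustration and do not come from the text) that enumerates the eight elementary histories and builds the event algebra as the collection of all subsets of the sample space.

```python
from itertools import combinations, product

# Elementary histories for three tosses: HHH, HHT, ..., TTT.
histories = ["".join(h) for h in product("HT", repeat=3)]

def event_algebra(sample_space):
    """Return every subset of the sample space as a frozenset."""
    events = []
    for size in range(len(sample_space) + 1):
        events.extend(frozenset(c) for c in combinations(sample_space, size))
    return events

events = event_algebra(histories)
compound = frozenset({"HHT", "TTT"})  # the pair mentioned in the text
```

Here `len(histories)` is 8 and `len(events)` is $2^8 = 256$; the empty set and the compound history `{HHT, TTT}` both appear among the events.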

As a second example, consider a die which is rolled $f$ times in succession. The 
sample space consists of $6^f$ possibilities $\{s_1, s_2, \ldots s_f\}$, where each $s_j$ takes some 
value between 1 and 6. 

A third example is a Brownian particle moving in a fluid and observed under 
a microscope at successive times $t_1, t_2, \ldots t_f$. The sequence of positions 
$\mathbf{r}_1, \mathbf{r}_2, \ldots \mathbf{r}_f$ is an example of a history, and the sample space consists of all possible 
sequences of this type. Since any measuring instrument has finite resolution, 
one can, if one wants, suppose that for the purpose of recording the data the region 
inside the fluid is thought of as divided up into a collection of small cubical cells, 
with $\mathbf{r}_j$ the label of the cell containing the particle at time $t_j$. 

A fourth example is a particle undergoing a random walk in one dimension, a 
sort of “toy model” of Brownian motion. Assume that the location of the particle 
or random walker, denoted by $s$, is an integer in the range 

$-M_a \le s \le M_b$. (8.1) 

One could allow $s$ to be any integer, but using the limited range (8.1) results in a 
finite sample space of $M = M_a + M_b + 1$ possibilities at any given time. At each 
time step the particle either remains where it is, or hops to the right or to the left. 
Hence a history of the particle’s motion consists in giving its positions at a set of 
times $t = 0, 1, \ldots f$ as a sequence of integers 

$\mathbf{s} = (s_0, s_1, s_2, \ldots s_f)$, (8.2) 

where each $s_j$ falls in the interval (8.1). The sample space of histories consists of 
the $M^{f+1}$ different sequences $\mathbf{s}$. (Letting $s_0$ rather than $s_1$ be the initial position 
of the particle is of no importance; the convention used here agrees with that in 
the next chapter.) One could employ histories extending to $t = \infty$, but that would 
mean using an infinite sample space. 

This sample space can be thought of as produced by successively refining an 
initial, coarse sample space in which $s_0$ takes one of $M$ possible values, and nothing 
is said about what happens at later times. Histories involving the two times $t = 0$ 
and 1 are produced by taking a point in this initial sample space, say $s_0 = 3$, and 
“splitting it up” into two-time histories of the form $(3, s_1)$, where $s_1$ can take on any 
one of the $M$ values in (8.1). Given a point, say (3, 2), in this new sample space, 
it can again be split up into elements of the form $(3, 2, s_2)$, and so forth. Note that 
any history involving fewer than $f + 1$ times can be thought of as a compound history 
on the full sample space. Thus (3, 2) consists of all sequences $\mathbf{s}$ for which $s_0 = 3$ 
and $s_1 = 2$. Rather than starting with a coarse sample space of events at $t = 0$, 
one could equally well begin with a later time, such as all the possibilities for $s_2$ at 
$t = 2$, and then refine this space by including additional details at both earlier and 
later times. 
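
The relation between a short history and the compound history it represents on the full sample space can be made concrete. The sketch below is illustrative only: the values $M_a = M_b = 3$ and $f = 3$ are assumptions chosen so that the site 3 used in the text lies inside the interval (8.1), and the function names are hypothetical.

```python
from itertools import product

M_A, M_B, F = 3, 3, 3            # assumed values, for illustration only
SITES = range(-M_A, M_B + 1)     # the M = M_A + M_B + 1 allowed positions

def sample_space(f):
    """All sequences (s_0, ..., s_f) with every s_j in the interval (8.1)."""
    return list(product(SITES, repeat=f + 1))

def compound_history(prefix, f):
    """Elementary histories on times 0..f whose first entries equal `prefix`."""
    prefix = tuple(prefix)
    return [s for s in sample_space(f) if s[:len(prefix)] == prefix]

full = sample_space(F)               # M**(F + 1) elementary histories
three_two = compound_history((3, 2), F)
```

With these values the full sample space has $7^4 = 2401$ elements, and the two-time history (3, 2) corresponds to the $7^2 = 49$ elementary histories that begin with $s_0 = 3$, $s_1 = 2$.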


8.3 Quantum histories 

A quantum history of a physical system is a sequence of quantum events at successive 
times, where a quantum event at a particular time can be any quantum property 
of the system in question. Thus given a set of times $t_1 < t_2 < \cdots t_f$, a quantum 
history is specified by a collection of projectors $(F_1, F_2, \ldots F_f)$, one projector for 
each time. It is convenient, both for technical and for conceptual reasons, to suppose 
that the number $f$ of distinct times is finite, though it might be very large. It 
is always possible to add additional times to those in the list $t_1 < t_2 < \cdots t_f$ in 
the manner indicated in Sec. 8.4. Sometimes the initial time will be denoted by $t_0$ 
rather than $t_1$. 

For a spin-half particle, $([z^+], [x^+])$ is an example of a history involving two 
times, while $([z^+], [x^+], [z^+])$ is an example involving three times. 

As a second example, consider a harmonic oscillator. A possible history with 
three different times is the sequence of events 

$F_1 = [\phi_1] + [\phi_2], \quad F_2 = [\phi_1], \quad F_3 = X$, (8.3) 

where $[\phi_n]$ is the projector on the energy eigenstate with energy $(n + 1/2)\hbar\omega$, and 
$X$ is the projector defined in (4.20) corresponding to the position $x$ lying in the 
interval $x_1 < x < x_2$. Note that the projectors making up a history do not have to 
project onto a one-dimensional subspace of the Hilbert space. In this example, $F_1$ 
projects onto a two-dimensional subspace, $F_2$ onto a one-dimensional subspace, 
and $X$ onto an infinite-dimensional subspace. 

As a third example, consider a coin tossed three times in a row. A physical coin is 
made up of atoms, so it has in principle a (rather complicated) quantum mechanical 
description. Thus a “classical” property such as “heads” will correspond to some 
quantum projector H onto a subspace of enormous dimension, and there will be 
another projector T for “tails”. Then by using the projectors 

$F_1 = H, \quad F_2 = T, \quad F_3 = T$ (8.4) 

at successive times one obtains a quantum history HTT for the coin. 

As a fourth example of a quantum history, consider a Brownian particle suspended 
in a fluid. Whereas this is usually described in classical terms, the particle 
and the surrounding fluid are, in reality, a quantum system. At time $t_j$ let $F_j$ be the 
projector, in an appropriate Hilbert space, for the property that the center of mass 
of the Brownian particle is inside a particular cubical cell. Then $(F_1, F_2, \ldots F_f)$ 
is the quantum counterpart of the classical history $\mathbf{r}_1, \mathbf{r}_2, \ldots \mathbf{r}_f$ introduced earlier, 
with $\mathbf{r}_j$ understood as a cell label, rather than a precise position. 

One does not normally think of coin tossing in “quantum” terms, and there is 
really no advantage to doing so, since a classical description is simpler, and is 
perfectly adequate. Similarly, a classical description of the motion of a Brownian 
particle is usually quite adequate. However, these examples illustrate the fact that 
the concept of a quantum history is really quite general, and is by no means limited 
to processes and events at an atomic scale, even though that is where quantum his- 
tories are most useful, precisely because the corresponding classical descriptions 
are not adequate. 

The sample space of a coin tossed $f$ times in a row is formally the same as 
the sample space of $f$ coins tossed simultaneously: each consists of $2^f$ mutually 
exclusive possibilities. Since in quantum theory the Hilbert space of a collection 
of $f$ systems is the tensor product of the separate Hilbert spaces, Ch. 6, it seems 
reasonable to use a tensor product of $f$ spaces for describing the different histories 
of a single quantum system at $f$ successive times. Thus we define a history Hilbert 
space as a tensor product 

$\breve{\mathcal{H}} = \mathcal{H}_1 \odot \mathcal{H}_2 \odot \cdots \mathcal{H}_f$, (8.5) 

where for each $j$, $\mathcal{H}_j$ is a copy of the Hilbert space $\mathcal{H}$ used to describe the system at 
a single time, and $\odot$ is a variant of the tensor product symbol $\otimes$. We could equally 
well write $\mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \cdots$, but it is helpful to have a distinctive notation for a tensor 
product when the factors in it refer to different times, and reserve $\otimes$ for a tensor 
product of spaces at a single time. On the history space $\breve{\mathcal{H}}$ the history $(F_1, F_2, \ldots F_f)$ is 
represented by the (tensor) product projector 

$Y = F_1 \odot F_2 \odot \cdots F_f$. (8.6) 

That $Y$ is a projector, that is, $Y^\dagger = Y = Y^2$, follows from the fact that each $F_j$ 
is a projector, and from the rules for adjoints and products of operators on tensor 
products as discussed in Sec. 6.4. 
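
The statement that a product of projectors at successive times is itself a projector on the history space can be verified directly for small cases. The following sketch is an illustration under stated assumptions: projectors are written as matrices in the $S_z$ basis of a spin-half particle, the Kronecker product stands in for $\odot$, and all helper names (`matmul`, `kron`, `dagger`, `close`, `history`) are invented here.

```python
def matmul(a, b):
    """Ordinary matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker (tensor) product; plays the role of the symbol ⊙ here."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def dagger(a):
    """Conjugate transpose of a matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices of the same shape."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

Z_PLUS = [[1.0, 0.0], [0.0, 0.0]]   # [z+] in the S_z basis
X_PLUS = [[0.5, 0.5], [0.5, 0.5]]   # [x+] in the S_z basis

def history(*projectors):
    """Build F_1 ⊙ F_2 ⊙ ... ⊙ F_f, with time increasing from left to right."""
    result = [[1.0]]
    for p in projectors:
        result = kron(result, p)
    return result

Y2 = history(Z_PLUS, X_PLUS)            # the two-time history ([z+], [x+])
Y3 = history(Z_PLUS, X_PLUS, Z_PLUS)    # the three-time history ([z+], [x+], [z+])
```

Both `Y2` (a 4 × 4 matrix) and `Y3` (8 × 8) satisfy $Y^\dagger = Y = Y^2$, as can be confirmed with `close(matmul(Y3, Y3), Y3)` and `close(dagger(Y3), Y3)`.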


8.4 Extensions and logical operations on histories 

Suppose that $f = 3$ in (8.6), so that 

$Y = F_1 \odot F_2 \odot F_3$. (8.7) 

This history can be extended to additional times by introducing the identity operator 
at the times not included in the initial set $t_1, t_2, t_3$. Suppose, for example, that 
we wish to add an additional time $t_4$ later than $t_3$. Then for times $t_1 < t_2 < t_3 < t_4$, 
(8.7) is equivalent to 

$Y = F_1 \odot F_2 \odot F_3 \odot I$, (8.8) 

because the identity operator $I$ represents the property which is always true, and 
therefore provides no additional information about the system at $t_4$. In the same 
way, one can introduce earlier and intermediate times, say $t_0$ and $t_{1.5}$, in which case 
(8.7) is equivalent to 

$Y = I \odot F_1 \odot I \odot F_2 \odot F_3 \odot I$ (8.9) 

on the history space $\breve{\mathcal{H}}$ for the times $t_0 < t_1 < t_{1.5} < t_2 < t_3 < t_4$. We shall 
always use a notation in which the events in a history are in temporal order, with 
time increasing from left to right. 

The notational convention for extensions of operators introduced in Sec. 6.4 justifies 
using the same symbol $Y$ in (8.7), (8.8), and (8.9). And its intuitive significance 
is precisely the same in all three cases: $Y$ means “$F_1$ at $t_1$, $F_2$ at $t_2$, and $F_3$ at 
$t_3$”, and tells us nothing at all about what is happening at any other time. Using the 
same symbol for $F$ and $F \odot I$ can sometimes be confusing for the reason pointed 
out at the end of Sec. 6.4. For example, the projector for a two-time history of a 
spin-half particle can be written as an operator product 

$[z^+] \odot [x^+] = ([z^+] \odot I) \cdot (I \odot [x^+])$ (8.10) 

of two projectors. If on the right side we replace $([z^+] \odot I)$ with $[z^+]$ and 
$(I \odot [x^+])$ with $[x^+]$, the result $[z^+] \cdot [x^+]$ is likely to be incorrectly interpreted as 
the product of two noncommuting operators on a single copy of the Hilbert space 
$\mathcal{H}$, rather than as the product of two commuting operators on the tensor product 
$\mathcal{H}_1 \odot \mathcal{H}_2$. Using the longer $([z^+] \odot I)$ avoids this confusion. 

If histories are written as projectors on the history Hilbert space $\breve{\mathcal{H}}$, the rules for 
the logical operations of negation, conjunction, and disjunction are precisely the 
same as for quantum properties at a single time, as discussed in Secs. 4.4 and 4.5. 
In particular, the negation of the history $Y$, “$Y$ did not occur”, corresponds to a 
projector 

$\tilde{Y} = \breve{I} - Y$, (8.11) 

where $\breve{I}$ is the identity on $\breve{\mathcal{H}}$. (Our notational convention allows us to write $\breve{I}$ as $I$, 
but $\breve{I}$ is clearer.) 

Note that a history does not occur if any event in it fails to occur. Thus the 
negation of HH when a coin is tossed two times in a row is not TT, but instead 
the compound history consisting of HT, TH, and TT. Similarly, the negation of 
the quantum history 

$Y = F_1 \odot F_2$ (8.12) 

given by (8.11) is a sum of three orthogonal projectors, 

$\tilde{Y} = F_1 \odot \tilde{F}_2 + \tilde{F}_1 \odot F_2 + \tilde{F}_1 \odot \tilde{F}_2$, (8.13) 

where $\tilde{F}_j$ means $I - F_j$. Note that the compound history $\tilde{Y}$ in (8.13) cannot be 
written in the form $G_1 \odot G_2$, that is, as an event at $t_1$ followed by another event at 
$t_2$. 
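
The decomposition of the negation into three mutually orthogonal product projectors can be checked for a concrete case. The sketch below is illustrative: it assumes a spin-half particle with $F_1 = [z^+]$ and $F_2 = [x^+]$ written as matrices in the $S_z$ basis, uses the Kronecker product for $\odot$, and all function and variable names are hypothetical.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product, standing in for ⊙."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

F1 = [[1.0, 0.0], [0.0, 0.0]]        # assumed choice: [z+]
F2 = [[0.5, 0.5], [0.5, 0.5]]        # assumed choice: [x+]
NOT_F1 = sub(identity(2), F1)        # I - F_1
NOT_F2 = sub(identity(2), F2)        # I - F_2

Y = kron(F1, F2)                     # the history (8.12)
NOT_Y = sub(identity(4), Y)          # the negation (8.11)

# The three terms on the right side of (8.13).
terms = [kron(F1, NOT_F2), kron(NOT_F1, F2), kron(NOT_F1, NOT_F2)]
total = add(add(terms[0], terms[1]), terms[2])
ZERO = [[0.0] * 4 for _ in range(4)]
```

Here `total` agrees with `NOT_Y`, and the product of any two distinct entries of `terms` is the zero matrix, which is the orthogonality claimed in the text.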

The conjunction $Y$ AND $Y'$, or $Y \wedge Y'$, of two histories is represented by the 
product $YY'$ of the projectors, provided they commute with each other. If $YY' \neq 
Y'Y$, the conjunction is not defined. The situation is thus entirely analogous to the 
conjunction of two quantum properties at a single time, as discussed in Secs. 4.5 
and 4.6. Let us suppose that the history 

$Y' = F'_1 \odot F'_2 \odot F'_3$ (8.14) 

is defined at the same three times as $Y$ in (8.7). Their conjunction is represented 
by the projector 

$Y' \wedge Y = Y'Y = F'_1 F_1 \odot F'_2 F_2 \odot F'_3 F_3$, (8.15) 

which is equal to $YY'$ provided that at each of the three times the projectors in the 
two histories commute: 

$F'_j F_j = F_j F'_j$ for $j = 1, 2, 3$. (8.16) 

However, there is a case in which $Y$ and $Y'$ commute even if some of the conditions 
in (8.16) are not satisfied. It occurs when the product of the two projectors at 
one of the times is 0, for this means that $YY' = 0$ independent of what projectors 
occur at other times. Here is an example involving a spin-half particle: 

$Y = [x^+] \odot [x^+] \odot [z^+]$, 
$Y' = [y^+] \odot [z^+] \odot [z^-]$. (8.17) 

The two projectors at $t_1$, $[x^+]$ and $[y^+]$, clearly do not commute with each other, 
and the same is true at time $t_2$. However, the projectors at $t_3$ are orthogonal, and 
thus $YY' = 0 = Y'Y$. 
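
This special case is easy to confirm with explicit matrices. The following is a sketch under stated assumptions: the spin-half projectors are written in the $S_z$ basis, the Kronecker product stands in for $\odot$, and the helper names are invented for the illustration.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product, standing in for ⊙."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def commute(a, b):
    return close(matmul(a, b), matmul(b, a))

def is_zero(a, tol=1e-12):
    return all(abs(x) < tol for row in a for x in row)

# Spin-half projectors in the S_z basis.
X_PLUS = [[0.5, 0.5], [0.5, 0.5]]
Y_PLUS = [[0.5, -0.5j], [0.5j, 0.5]]
Z_PLUS = [[1.0, 0.0], [0.0, 0.0]]
Z_MINUS = [[0.0, 0.0], [0.0, 1.0]]

# The two three-time histories of the example.
hist = kron(kron(X_PLUS, X_PLUS), Z_PLUS)          # [x+] ⊙ [x+] ⊙ [z+]
hist_prime = kron(kron(Y_PLUS, Z_PLUS), Z_MINUS)   # [y+] ⊙ [z+] ⊙ [z-]
```

The single-time projectors at $t_1$ and at $t_2$ fail to commute (`commute(X_PLUS, Y_PLUS)` and `commute(X_PLUS, Z_PLUS)` are both false), yet both products of `hist` and `hist_prime` vanish, so the two history projectors commute.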

A simple example of a nonvanishing conjunction is provided by a spin-half particle 
and two histories 

$Y = [z^+] \odot I, \quad Y' = I \odot [x^+]$, (8.18) 

defined at the times $t_1$ and $t_2$. The conjunction is 

$Y' \wedge Y = Y'Y = YY' = [z^+] \odot [x^+]$, (8.19) 

and this is sensible, for the intuitive significance of (8.19) is “$S_z = +1/2$ at $t_1$ and 
$S_x = +1/2$ at $t_2$.” Indeed, any history of the form (8.6) can be understood as “$F_1$ 
at $t_1$, and $F_2$ at $t_2$, and ... $F_f$ at $t_f$.” This example also shows how to generate 
the conjunction of two histories defined at different sets of times. First one must 
extend each history by including $I$ at additional times until the extended histories 
are defined on a common set of times. If the extended projectors commute with 
each other, the operator product of the projectors, as in (8.15), is the projector for 
the conjunction of the two histories. 

The disjunction “$Y'$ or $Y$ or both” of two histories is represented by a projector 

$Y' \vee Y = Y' + Y - Y'Y$ (8.20) 

provided $Y'Y = YY'$; otherwise it is undefined. The intuitive significance of the 
disjunction of two (possibly compound) histories is what one would expect, though 
there is a subtlety associated with the quantum disjunction which does not arise in 
the case of classical histories, as has already been noted in Sec. 4.5 for the case 
of properties at a single time. It can best be illustrated by means of an explicit 
example. For a spin-half particle, define the two histories 

$Y = [z^+] \odot [x^+], \quad Y' = [z^+] \odot [x^-]$. (8.21) 

The projector for the disjunction is 

$Y \vee Y' = Y + Y' = [z^+] \odot I$, (8.22) 

since in this case $YY' = 0$. The projector $Y \vee Y'$ tells us nothing at all about the spin 
of the particle at the second time: in and of itself it does not imply that $S_x = +1/2$ 
or $S_x = -1/2$ at $t_2$, since the subspace of $\breve{\mathcal{H}}$ on which it projects contains, among 
others, the history $[z^+] \odot [y^+]$, which is incompatible with $S_x$ having any value 
at all at $t_2$. On the other hand, when the projector $Y \vee Y'$ occurs in the context 
of a discussion in which both $Y$ and $Y'$ make sense, it can be safely interpreted as 
meaning (or implying) that at $t_2$ either $S_x = +1/2$ or $S_x = -1/2$, since any other 
possibility, such as $S_y = +1/2$, would be incompatible with $Y$ and $Y'$. 
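
Both halves of this example can be checked numerically: that the disjunction (8.22) equals $[z^+] \odot I$, and that the subspace it projects onto contains the history $[z^+] \odot [y^+]$. The sketch below assumes matrices in the $S_z$ basis and the Kronecker product for $\odot$; the names are chosen for illustration.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product, standing in for ⊙."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def disjunction(a, b):
    """Projector a + b - ab of (8.20); meaningful only if a and b commute."""
    ab = matmul(a, b)
    if not close(ab, matmul(b, a)):
        raise ValueError("projectors do not commute; disjunction undefined")
    return [[x + y - z for x, y, z in zip(ra, rb, rab)]
            for ra, rb, rab in zip(a, b, ab)]

I2 = [[1.0, 0.0], [0.0, 1.0]]
Z_PLUS = [[1.0, 0.0], [0.0, 0.0]]
X_PLUS = [[0.5, 0.5], [0.5, 0.5]]
X_MINUS = [[0.5, -0.5], [-0.5, 0.5]]
Y_PLUS = [[0.5, -0.5j], [0.5j, 0.5]]

hist = kron(Z_PLUS, X_PLUS)          # Y  in (8.21)
hist_prime = kron(Z_PLUS, X_MINUS)   # Y' in (8.21)
either = disjunction(hist, hist_prime)
other = kron(Z_PLUS, Y_PLUS)         # the history [z+] ⊙ [y+]
```

The matrix `either` coincides with `kron(Z_PLUS, I2)`, and multiplying `other` by `either` returns `other` unchanged, which is the sense in which the subspace of the disjunction contains a history incompatible with any value of $S_x$ at $t_2$.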

This example illustrates an important principle of quantum reasoning: The con- 
text, that is, the sample space or event algebra used for constructing a quantum 
description or discussing the histories of a quantum system, can make a difference 
in how one understands or interprets various symbols. In quantum theory it is 
important to be clear about precisely what sample space is being used. 


8.5 Sample spaces and families of histories 

As discussed in Sec. 5.2, a sample space for a quantum system at a single time 
is a decomposition of the identity operator for the Hilbert space $\mathcal{H}$: a collection 
of mutually orthogonal projectors which sum to $I$. In the same way, a sample 
space of histories is a decomposition of the identity on the history Hilbert space $\breve{\mathcal{H}}$, 
a collection $\{Y^\alpha\}$ of mutually orthogonal projectors representing histories which 
sum to the history identity: 

$\breve{I} = \sum_\alpha Y^\alpha$. (8.23) 

It is convenient to label the history projectors with a superscript in order to be able 
to reserve the subscript position for time. Since the square of a projector is equal 
to itself, we will not need to use superscripts on projectors as exponents. 

Associated with a sample space of histories is a Boolean “event” algebra, called 
a family of histories, consisting of projectors of the form 

$Y = \sum_\alpha \pi^\alpha Y^\alpha$, (8.24) 

with each $\pi^\alpha$ equal to 0 or 1, as in (5.12). Histories which are members of the sample 
space will be called elementary histories, whereas those of the form (8.24) with 
two or more $\pi^\alpha$ equal to 1 are compound histories. The term “family of histories” 
is also used to denote the sample space of histories which generates a particular 
Boolean algebra. Given the intimate connection between the sample space and the 
corresponding algebra, this double usage is unlikely to cause confusion. 

The simplest way to introduce a history sample space is to use a product of 
sample spaces as that term was defined in Sec. 6.6. Assume that at each time $t_j$ 
there is a decomposition of the identity $I_j$ for the Hilbert space $\mathcal{H}_j$, 

$I_j = \sum_{\alpha_j} P_j^{\alpha_j}$, (8.25) 

where the subscript $j$ labels the time, and the superscript $\alpha_j$ labels the different 
projectors which occur in the decomposition at this time. The decompositions 
(8.25) for different values of $j$ could be the same or they could be different; they 
need have no relationship to one another. (Note that the sample spaces for the 
different classical systems discussed in Sec. 8.2 have this sort of product structure.) 
Projectors of the form 

$Y^\alpha = P_1^{\alpha_1} \odot P_2^{\alpha_2} \odot \cdots P_f^{\alpha_f}$, (8.26) 

where $\alpha$ is an $f$-component label 

$\alpha = (\alpha_1, \alpha_2, \ldots \alpha_f)$, (8.27) 
make up the sample space, and it is straightforward to check that (8.23) is satisfied. 
Here is a simple example for a spin-half particle with $f = 2$: 

$I_1 = [z^+] + [z^-], \quad I_2 = [x^+] + [x^-]$. (8.28) 

The product of sample spaces consists of the four histories 

$Y^{++} = [z^+] \odot [x^+], \quad Y^{+-} = [z^+] \odot [x^-]$, 
$Y^{-+} = [z^-] \odot [x^+], \quad Y^{--} = [z^-] \odot [x^-]$, (8.29) 

in an obvious notation. The Boolean algebra or family of histories contains $2^4 = 
16$ elementary and compound histories, including the null history 0 (which never 
occurs). 
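
The properties claimed for this product of sample spaces, and the construction (8.24) of the associated family, can be sketched as follows. The code is an illustration only: projectors are matrices in the $S_z$ basis, the Kronecker product stands in for $\odot$, and the names `elementary`, `compound`, and `family` are invented here.

```python
from itertools import product

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product, standing in for ⊙."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

Z = {"+": [[1.0, 0.0], [0.0, 0.0]], "-": [[0.0, 0.0], [0.0, 1.0]]}
X = {"+": [[0.5, 0.5], [0.5, 0.5]], "-": [[0.5, -0.5], [-0.5, 0.5]]}

# The four elementary histories of (8.29), keyed by "++", "+-", "-+", "--".
elementary = {a + b: kron(Z[a], X[b]) for a, b in product("+-", repeat=2)}
labels = sorted(elementary)

def compound(bits):
    """Projector of the form (8.24); bits[i] is the 0/1 coefficient of labels[i]."""
    total = [[0.0] * 4 for _ in range(4)]
    for bit, label in zip(bits, labels):
        if bit:
            total = [[x + y for x, y in zip(rt, ry)]
                     for rt, ry in zip(total, elementary[label])]
    return total

# One member of the family for each choice of the four coefficients.
family = [compound(bits) for bits in product((0, 1), repeat=4)]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
```

Setting every coefficient to 1 reproduces the history identity, setting every coefficient to 0 gives the null history, and `len(family)` is 16. The choice `(1, 1, 0, 0)` gives the compound history $Y^{++} + Y^{+-} = [z^+] \odot I$.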

Another type of sample space that arises quite often in practice consists of histories 
which begin at an initial time $t_0$ with a specific state represented by a projector 
$\Psi_0$, but behave in different ways at later times. We shall refer to it as a family based 
upon the initial state $\Psi_0$. A relatively simple version is that in which the histories 
are of the form 

$Y^\alpha = \Psi_0 \odot P_1^{\alpha_1} \odot P_2^{\alpha_2} \odot \cdots P_f^{\alpha_f}$, (8.30) 

with the projectors at times later than $t_0$ drawn from decompositions of the identity 
of the type (8.25). The sum over $\alpha$ of the projectors in (8.30) is equal to $\Psi_0$, so in 
order to complete the sample space one adds one more history 

$Z = (I - \Psi_0) \odot I \odot I \odot \cdots I$ (8.31) 

to the collection. If, as is usually the case, one is only interested in the histories 
which begin with the initial state $\Psi_0$, the history $Z$ is assigned zero probability, 
after which it can be ignored. The procedure for assigning probabilities to the 
other histories will be discussed in later chapters. Note that histories of the form 

$(I - \Psi_0) \odot P_1^{\alpha_1} \odot P_2^{\alpha_2} \odot \cdots P_f^{\alpha_f}$ (8.32) 

are not present in the sample space, and for this reason the family of histories based 
upon an initial state $\Psi_0$ is distinct from a product of sample spaces in which (8.25) 
is supplemented with an additional decomposition 

$I_0 = \Psi_0 + (I - \Psi_0)$ (8.33) 

at time $t_0$. As a consequence, later events in a family based upon an initial state $\Psi_0$ 
are dependent upon the initial state in the technical sense discussed in Ch. 14. 

Other examples of sample spaces which are not products of sample spaces are 
used in various applications of quantum theory, and some of them will be discussed 
in later chapters. In all cases the individual histories in the sample space correspond 
to product projectors on the history space $\breve{\mathcal{H}}$ regarded as a tensor product of Hilbert 
spaces at different times, (8.5). That is, they are of the form (8.6): a quantum 
property at $t_1$, another quantum property at $t_2$, and so forth. Since the history space 
$\breve{\mathcal{H}}$ is a Hilbert space, it also contains subspaces which are not of this form, but 
might be said to be “entangled in time”. For example, in the case of a spin-half 
particle and two times $t_1$ and $t_2$, the ket 

$|\epsilon\rangle = \bigl(|z^+\rangle \odot |z^-\rangle - |z^-\rangle \odot |z^+\rangle\bigr)/\sqrt{2}$ (8.34) 

is an element of $\breve{\mathcal{H}}$, and therefore $[\epsilon] = |\epsilon\rangle\langle\epsilon|$ is a projector on $\breve{\mathcal{H}}$. It seems 
difficult to find a physical interpretation for histories of this sort, or sample spaces 
containing such histories. 


8.6 Refinements of histories 

The process of refining a sample space in which coarse projectors are replaced with 
finer projectors on subspaces of lower dimensionality was discussed in Sec. 5.3. 
Refinement is often used to construct sample spaces of histories, as was noted in 
connection with the classical random walk in one dimension in Sec. 8.2. Here is 
a simple example to show how this process works for a quantum system. Consider 
a spin-half particle and a decomposition of the identity $\{[z^+], [z^-]\}$ at time 
$t_1$. Each projector corresponds to a single-time history which can be extended to a 
second time $t_2$ in the manner indicated in Sec. 8.4, to make a history sample space 
containing 

$[z^+] \odot I, \quad [z^-] \odot I$. (8.35) 

If one uses this sample space, there is nothing one can say about the spin of the 
particle at the second time $t_2$, since $I$ is always true, and is thus completely uninformative. 
However, the first projector in (8.35) is the sum of $[z^+] \odot [z^+]$ and 
$[z^+] \odot [z^-]$, and if one replaces it with these two projectors, and the second projector 
in (8.35) with the corresponding pair $[z^-] \odot [z^+]$ and $[z^-] \odot [z^-]$, the result 
is a sample space 

$[z^+] \odot [z^+], \quad [z^+] \odot [z^-]$, 
$[z^-] \odot [z^+], \quad [z^-] \odot [z^-]$, (8.36) 

which is a refinement of (8.35), and permits one to say something about the spin at 
time $t_2$ as well as at $t_1$. 

When it is possible to refine a sample space in this way, there are always a 
large number of ways of doing it. Thus the four histories in (8.29) also constitute 
a refinement of (8.35). However, the refinements (8.29) and (8.36) are mutually 
incompatible, since it makes no sense to talk about $S_x$ at $t_2$ at the same time that 
one is ascribing values to $S_z$, and vice versa. Both (8.29) and (8.36) are products 
of sample spaces, but refinements of (8.35) which are not of this type are also 
possible; for example, 

$[z^+] \odot [z^+], \quad [z^+] \odot [z^-]$, 
$[z^-] \odot [x^+], \quad [z^-] \odot [x^-]$, (8.37) 

where the decomposition of the identity used at $t_2$ is different depending upon 
which event occurs at $t_1$. 

The process of refinement can continue by first extending the histories in (8.36) 
or (8.37) to an additional time, either later than $t_2$ or earlier than $t_1$ or between $t_1$ 
and $t_2$, and then replacing the identity $I$ at this additional time with two projectors 
onto pure states. Note that the process of extension does not by itself lead 
to a refinement of the sample space, since it leaves the number of histories and 
their intuitive interpretation unchanged; refinement occurs when I is replaced with 
projectors on lower-dimensional spaces. 

It is important to notice that refinement is not some sort of physical process 
which occurs in the quantum system described by these histories. Instead, it is 
a conceptual process carried out by the quantum physicist in the process of con- 
structing a suitable mathematical description of the time dependence of a quantum 
system. Unlike deterministic classical mechanics, in which the state of a system at 
a single time yields a unique description (orbit in the phase space) of what happens 
at other times, stochastic quantum mechanics allows for a large number of alterna- 
tive descriptions, and the process of refinement is often a helpful way of selecting 
useful and interesting sample spaces from among them. 


8.7 Unitary histories 

Thus far we have discussed quantum histories without any reference to the dynamical 
laws of quantum mechanics. The dynamics of histories is not a trivial matter, 
and is the subject of the next two chapters. However, at this point it is convenient to 
introduce the notion of a unitary history. The simplest example of such a history is 
the sequence of kets $|\psi_{t_1}\rangle, |\psi_{t_2}\rangle, \ldots |\psi_{t_f}\rangle$, where $|\psi_t\rangle$ is a solution of Schrödinger’s 
equation, Sec. 7.3, or, to be more precise, the corresponding sequence of projectors 
$[\psi_{t_1}], [\psi_{t_2}], \ldots$. The general definition is that a history of the form (8.6) is unitary 
provided 

$F_j = T(t_j, t_1) F_1 T(t_1, t_j)$ (8.38) 

is satisfied for $j = 1, 2, \ldots f$. That is to say, all the projectors in the history 
are generated from $F_1$ by means of the unitary time development operators introduced 
in Sec. 7.3, see (7.44). In fact, $F_1$ does not play a distinguished role in this 
definition and could be replaced by $F_k$ for any $k$, because for a set of projectors 
given by (8.38), $T(t_j, t_k) F_k T(t_k, t_j)$ is equal to $F_j$ whatever the value of $k$. 

One can also define unitary families of histories. We shall limit ourselves to the 
case of a product of sample spaces, in the notation of Sec. 8.5, and assume that for 
each time $t_j$ there is a decomposition of the identity of the form 

$I_j = \sum_\alpha P_j^\alpha$. (8.39) 

The corresponding family is unitary if for each choice of $\alpha$ these projectors satisfy 
(8.38), that is, 

$P_j^\alpha = T(t_j, t_1) P_1^\alpha T(t_1, t_j)$ (8.40) 

for every $j$. In the simplest (interesting) family of this type each decomposition of 
the identity contains only two projectors; for example, $[\psi_{t_j}]$ and $I - [\psi_{t_j}]$. Notice 
that while a unitary family will contain unitary histories, such as 

$P_1^1 \odot P_2^1 \odot P_3^1 \odot \cdots P_f^1$, (8.41) 

it will also contain other histories, such as 

$P_1^1 \odot P_2^2 \odot P_3^1 \odot \cdots P_f^1$, (8.42) 

which are not unitary. We will have more to say about unitary histories and families 
of histories in Secs. 9.3, 9.6, and 10.3. 
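
The condition (8.38) can be illustrated with a toy dynamics. The sketch below is not taken from the text: it assumes integer times $t_j = j$ and a hypothetical time development operator $T(t_j, t_k)$ equal to a real rotation of a two-dimensional Hilbert space through the angle $(j - k)\theta$, with $\theta = 0.3$ chosen arbitrarily. All names are invented for the example.

```python
import math

THETA = 0.3  # assumed rotation angle per unit time (hypothetical dynamics)

def T(j, k):
    """Assumed time development operator T(t_j, t_k): rotation by (j - k) * THETA."""
    c, s = math.cos((j - k) * THETA), math.sin((j - k) * THETA)
    return [[c, -s], [s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def evolved(j, k, f_k):
    """T(t_j, t_k) F_k T(t_k, t_j), the right side of (8.38) with t_1 replaced by t_k."""
    return matmul(matmul(T(j, k), f_k), T(k, j))

def is_unitary_history(projectors):
    """Check (8.38) for a dict {j: F_j} that includes the key j = 1."""
    return all(close(projectors[j], evolved(j, 1, projectors[1]))
               for j in projectors)

F1 = [[1.0, 0.0], [0.0, 0.0]]
unitary = {j: evolved(j, 1, F1) for j in range(1, 5)}

# Replace the event at t_2 by its negation I - F_2; the result is no longer unitary.
broken = dict(unitary)
broken[2] = [[(1.0 if r == c else 0.0) - unitary[2][r][c] for c in range(2)]
             for r in range(2)]
```

The history `unitary` satisfies (8.38) for every $j$, and `evolved(j, k, unitary[k])` reproduces `unitary[j]` for any pair of times, which is the observation that $F_1$ plays no distinguished role. The history `broken`, which has the form (8.42) with a different projector at $t_2$, fails the test.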


9.1 Classical random walk 

The previous chapter showed how to construct sample spaces of histories for both 
classical and quantum systems. Now we shall see how to use dynamical laws in 
order to assign probabilities to these histories. It is useful to begin with a classical 
random walk of a particle in one dimension, as it provides a helpful guide for 
quantum systems, which are discussed beginning in Sec. 9.3, as well as in the next 
chapter. The sample space of random walks, Sec. 8.2, consists of all sequences of 
the form 


$\mathbf{s} = (s_0, s_1, s_2, \ldots s_f)$, (9.1) 

where $s_j$, an integer in the range 

$-M_a \le s_j \le M_b$, (9.2) 

is the position of the particle or random walker at time $t = j$. 

We shall assume that the dynamical law for the particle’s motion is that when 
the time changes from t to t + 1, the particle can take one step to the left, from s to 
s — 1, with probability p, remain where it is with probability q, or take one step to 
the right, from s to s + 1, with probability r, where 

$p + q + r = 1$. (9.3) 

The probability for hops in which $s$ changes by 2 or more is 0. The endpoints of 
the interval (9.2) are thought of as connected by a periodic boundary condition, so 
that $M_b$ is one step to the left of $-M_a$, which in turn is one step to the right of $M_b$. 
The dynamical law can be used to generate a probability distribution on the sample 
space of histories in the following way. We begin by assigning to each history a 
weight 

$W(\mathbf{s}) = \prod_{j=1}^{f} w(s_j - s_{j-1})$, (9.4) 

where the hopping probabilities 

$w(-1) = p, \quad w(0) = q, \quad w(+1) = r$ (9.5) 

were introduced earlier, and $w(\Delta s) = 0$ for $|\Delta s| \ge 2$. 
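
The weight (9.4) is straightforward to compute. The following is a minimal sketch with assumed parameter values ($p = r = 0.25$, $q = 0.5$, $M_a = M_b = 2$); the function names are invented. To implement the periodic boundary condition, the hopping probability is written here as a function of the old and new positions, with the difference taken modulo $M$, rather than as a function of $s_j - s_{j-1}$ alone. This requires $M \ge 3$ so that the three kinds of step are distinct.

```python
from itertools import product

P, Q, R = 0.25, 0.5, 0.25     # assumed hopping probabilities, P + Q + R = 1
M_A, M_B = 2, 2               # assumed limits of the interval (9.2)
M = M_A + M_B + 1             # number of sites
SITES = range(-M_A, M_B + 1)

def w(s_new, s_old):
    """Hopping probability for one time step, with periodic boundary condition."""
    d = (s_new - s_old) % M
    if d == 0:
        return Q              # particle stays where it is
    if d == 1:
        return R              # one step to the right (M_B wraps around to -M_A)
    if d == M - 1:
        return P              # one step to the left (-M_A wraps around to M_B)
    return 0.0                # hops of two or more sites never occur

def weight(s):
    """Weight W(s) of (9.4) for a history s = (s_0, s_1, ..., s_f)."""
    total = 1.0
    for j in range(1, len(s)):
        total *= w(s[j], s[j - 1])
    return total

# Sum of the weights of all histories with f = 3 that start at s_0 = 0.
total_from_origin = sum(weight((0,) + tail) for tail in product(SITES, repeat=3))
```

For a fixed starting point the weights of all histories add up to $(p + q + r)^f = 1$, so `total_from_origin` equals 1 up to rounding. A history containing a forbidden hop, such as `(0, 2, 2, 2)`, has weight 0.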

The weights by themselves do not determine a probability. Instead, they must 
be combined with other information, such as the starting point of the particle at 
t = 0, or a probability distribution for this starting point, or perhaps information 
about where the particle is located at some later time(s). This information is not 
contained in the dynamical laws themselves, so we shall refer to it as contingent 
information or initial data. The “initial” in initial data refers to the beginning of 
an argument or calculation, and not necessarily to the earliest time in the random 
walk. The single contingent piece of information “$s = 3$ at $t = 2$” can be the initial
datum used to generate a probability distribution on the space of all histories of the
form (9.1). Contingent information is also needed for deterministic processes. The
orbit of the planet Mars can be calculated using the laws of classical mechanics, but 
to get the calculation started one needs to provide its position and velocity at some 
particular time. These data are contingent in the sense that they are not determined 
by the laws of mechanics, but must be obtained from observations. Once they are 
given, the position of Mars can be calculated at earlier as well as later times. 

Contingent information in the case of a random walk is often expressed as a
probability distribution $p_0(s_0)$ on the coarse sample space of positions at $t = 0$. (If
the particle starts at a definite location, the distribution $p_0$ assigns the value 1 to
this position and 0 to all others.) The probability distribution on the refined sample
space of histories is then determined by a refinement rule that says, in essence,
that for each $s_0$, the probability $p_0(s_0)$ is to be divided up among all the different
histories which start at this point at $t = 0$, with history $\mathbf{s}$ assigned a fraction of
$p_0(s_0)$ proportional to its weight $W(\mathbf{s})$. One could also use a refinement rule if the
contingent data were in the form of a position or a probability distribution at some
later time, say $t = 2$ or $t = f$, or if positions were given at two or more different
times. Things are more complicated when probability distributions are specified at
two or more times.

In order to turn the refinement rule for a probability distribution at $t = 0$ into a
formula, let $J(s_0)$ be the set of all histories which begin at $s_0$, and

$$N(s_0) := \sum_{\mathbf{s} \in J(s_0)} W(\mathbf{s}) \tag{9.6}$$

the sum of their weights. The probability of a particular history is then given by
the formula

$$\Pr(\mathbf{s}) = p_0(s_0)\, W(\mathbf{s}) / N(s_0). \tag{9.7}$$

These probabilities sum to 1 because the initial probabilities $p_0(s_0)$ sum to 1, and
because the weights have been suitably normalized by dividing by the normalization
factor $N(s_0)$. In fact, for the weights defined by (9.4) and (9.5) using hopping
probabilities which satisfy (9.3), it is not hard to show that the sum in (9.6) is
equal to 1, so that in this particular case the normalization can be omitted from
(9.7). However, it is sometimes convenient to work with weights which are not
normalized, and then the factor of $1/N(s_0)$ is needed.
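The construction in (9.4)-(9.7) can be checked by brute force on a small lattice. The following is a minimal sketch, not part of the original text: the values of `P`, `Q`, `R`, `MA`, `MB`, and `F` are arbitrary illustrative choices, and the function names are hypothetical. Exact fractions are used so that the normalization can be tested without rounding error.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product

# Illustrative parameters (assumptions, not taken from the text).
P, Q, R = Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)  # p + q + r = 1, eq. (9.3)
MA, MB, F = 2, 2, 3                  # positions -MA..MB, times t = 0..F
SITES = list(range(-MA, MB + 1))
L = len(SITES)

def hop_weight(ds):
    """w(ds) of (9.5), with the periodic boundary condition on the interval (9.2)."""
    ds = (ds + L // 2) % L - L // 2  # fold the displacement into the range around 0
    return {-1: P, 0: Q, 1: R}.get(ds, Fraction(0))

def weight(history):
    """W(s) of (9.4): product of hop weights along the history."""
    w = Fraction(1)
    for a, b in zip(history, history[1:]):
        w *= hop_weight(b - a)
    return w

@lru_cache(maxsize=None)
def normalization(s0):
    """N(s0) of (9.6): total weight of all histories that start at s0."""
    return sum(weight((s0,) + tail) for tail in product(SITES, repeat=F))

def probability(history, p0):
    """Pr(s) of (9.7); p0 is a dict mapping starting site to initial probability."""
    s0 = history[0]
    return p0.get(s0, Fraction(0)) * weight(history) / normalization(s0)

p0 = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}
total = sum(probability(h, p0) for h in product(SITES, repeat=F + 1))
```

With these choices `normalization(s0)` comes out equal to 1 for every starting site, which is the statement in the text that the factor $1/N(s_0)$ can be dropped when the hopping probabilities satisfy (9.3).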

Suppose the particle starts at $s_0 = 2$, so that $p_0(2) = 1$. Then the histories (2, 1),
(2, 2), (2, 3), and (2, 4), which are compound histories for $f \ge 2$, have probabilities
$p$, $q$, $r$, and 0, respectively. Likewise, the histories (2, 2, 2), (2, 2, 3), (2, 3, 4),
(2, 4, 3) have probabilities of $q^2$, $qr$, $r^2$, and 0. Any history in which the particle
hops by a distance of 2 or more in a single time step has zero probability, that
is, it is impossible. One could reduce the size of the sample space by eliminating
impossible histories, but in practice it is more convenient to use the larger sample
space.

As another example, suppose that $p_0(0) = p_0(1) = p_0(2) = 1/3$. What is the
probability that $s_1 = 2$ at time $t = 1$? Think of $s_1 = 2$ as a compound history given
by the collection of all histories which pass through $s = 2$ when $t = 1$, so that its
probability is the sum of probabilities of the histories in this collection. Clearly
histories with zero probability can be ignored, and this leaves only three two-time
histories: (1, 2), (2, 2), and (3, 2). In the case $f = 1$, formula (9.7) assigns them
probabilities $r/3$, $q/3$, and 0, so the answer to the question is $(q + r)/3$. This
answer is also correct for $f \ge 2$, but then it is not quite so obvious. The reader
may find it a useful exercise to work out the case $f = 2$, in which there are nine
histories of nonzero weight passing through $s = 2$ at $t = 1$.
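For readers who want to compare their answer, the case $f = 2$ can be organized as the following sketch. It uses only (9.3)-(9.7) with $N(s_0) = 1$ and assumes that the interval (9.2) is wide enough for the periodic boundary to play no role. The nine histories of nonzero weight are $(s_0, 2, s_2)$ with $s_0, s_2 \in \{1, 2, 3\}$; the three with $s_0 = 3$ drop out because $p_0(3) = 0$.

```latex
\begin{aligned}
\Pr(s_1 = 2)
  &= \sum_{s_0} \sum_{s_2} p_0(s_0)\, w(2 - s_0)\, w(s_2 - 2) \\
  &= \Bigl[\, \sum_{s_0} p_0(s_0)\, w(2 - s_0) \Bigr]
     \Bigl[\, \sum_{s_2} w(s_2 - 2) \Bigr] \\
  &= \tfrac{1}{3}\bigl[\, w(2) + w(1) + w(0) \bigr]\,(p + q + r) \\
  &= (q + r)/3 .
\end{aligned}
```

The second factor is the sum over the final step, which equals 1 by (9.3). The same factorization works for any $f \ge 2$, which is why the answer does not depend on how many later times are included.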

Once probabilities have been assigned on the sample space, one can answer
questions such as: “What is the probability that the particle was at $s = 2$ at time
$t = 3$, given that it arrived at $s = 4$ at time $t = 5$?” by means of conditional
probabilities:

$$\Pr(s_3 = 2 \mid s_5 = 4) = \Pr[(s_3 = 2) \land (s_5 = 4)] / \Pr(s_5 = 4). \tag{9.8}$$

Here the event $(s_3 = 2) \land (s_5 = 4)$ is the compound history consisting of all
elementary histories which pass through $s = 2$ at time $t = 3$ and $s = 4$ at time
$t = 5$. Such conditional probabilities depend, in general, both on the initial data
and the weights. However, if a value of $s_0$ is one of the conditions, then the
conditional probability does not depend upon $p_0(s_0)$ (assuming $p_0(s_0) > 0$, so that the
conditional probability is defined). In particular,

$$\Pr(\mathbf{s} \mid s_0 = m) = \delta_{m s_0} W(\mathbf{s}) / N(s_0). \tag{9.9}$$

To obtain similar formulas in other cases, it is convenient to extend the definition 
of weights to include compound histories in the event algebra using the formula 

$$W(E) = \sum_{\mathbf{s} \in E} W(\mathbf{s}). \tag{9.10}$$

Defining W (E) for the compound event E in this way makes it an “additive set 
function” or “measure” in the sense that if E and F are disjoint (they have no 
elementary histories in common) members of the event algebra of histories, then 

$$W(E \cup F) = W(E) + W(F). \tag{9.11}$$


Using this extended definition of $W$, one can, for example, write

$$\Pr[s_3 = 2 \mid (s_0 = 1) \land (s_5 = 4)] = \frac{W[(s_0 = 1) \land (s_3 = 2) \land (s_5 = 4)]}{W[(s_0 = 1) \land (s_5 = 4)]}. \tag{9.12}$$

That is, take the total weight of all the histories which satisfy the conditions $s_0 = 1$
and $s_5 = 4$, and find what fraction of it corresponds to histories which also have
$s_3 = 2$.
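The ratio of weights in (9.12) is easy to evaluate by enumeration. The sketch below is an illustration under stated assumptions: it takes $p = q = r = 1/3$ (a choice made here, not in the text at this point), $f = 5$, and an interval wide enough that the periodic boundary is never reached. Only histories of nonzero weight are generated, since the others contribute nothing to either sum. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

P = Q = R = Fraction(1, 3)   # illustrative hopping probabilities (assumption)

def w(ds):
    """Hop weight of (9.5)."""
    return {-1: P, 0: Q, 1: R}.get(ds, Fraction(0))

def W(history):
    """Weight of an elementary history, eq. (9.4)."""
    out = Fraction(1)
    for a, b in zip(history, history[1:]):
        out *= w(b - a)
    return out

def histories_from(s0, f):
    """All histories of nonzero weight that start at s0 and run to time f."""
    for steps in product((-1, 0, 1), repeat=f):
        h = [s0]
        for d in steps:
            h.append(h[-1] + d)
        yield tuple(h)

def conditional(s0, f, given, also):
    """Fraction of the weight satisfying `given` that also satisfies `also`."""
    hs = list(histories_from(s0, f))
    den = sum(W(h) for h in hs if given(h))
    num = sum(W(h) for h in hs if given(h) and also(h))
    return num / den

# Pr[s_3 = 2 | (s_0 = 1) and (s_5 = 4)]
answer = conditional(1, 5, given=lambda h: h[5] == 4, also=lambda h: h[3] == 2)
print(answer)  # 2/5
```

With equal hopping probabilities every allowed history has the same weight, so the answer reduces to counting paths: 6 of the 15 paths from $s = 1$ to $s = 4$ in five steps pass through $s = 2$ at $t = 3$.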


9.2 Single-time probabilities 

The probability that at time $t$ the random walker of Sec. 9.1 will be located at $s$ is
given by the single-time probability distribution*

$$p_t(s) = \sum_{\mathbf{s} \in J_t(s)} \Pr(\mathbf{s}), \tag{9.13}$$

where the sum is over the collection $J_t(s)$ of all histories which pass through $s$ at
time $t$. Because the particle must be somewhere at time $t$, it follows that

$$\sum_s p_t(s) = 1. \tag{9.14}$$

It is easy to show that the dynamical law used in Sec. 9.1 implies that $p_t(s)$
satisfies the difference equation

$$p_{t+1}(s) = p\, p_t(s + 1) + q\, p_t(s) + r\, p_t(s - 1). \tag{9.15}$$

In particular, if the contingent information is given by a probability distribution at
$t = 0$, so that $p_t(s)$ at $t = 0$ is the given initial distribution $p_0(s)$, (9.15) can be
used to calculate $p_t(s)$ at any later time $t$. For example, if the random walker starts
off at $s = 0$ when $t = 0$, and $p = q = r = 1/3$, then $p_0(0) = 1$, while

$$p_1(-1) = p_1(0) = p_1(1) = 1/3,$$
$$p_2(-2) = p_2(2) = 1/9, \quad p_2(-1) = p_2(1) = 2/9, \quad p_2(0) = 1/3 \tag{9.16}$$

are the nonzero values of $p_t(s)$ for $t = 1$ and 2.

* The term one-dimensional distribution is often used, but in the present context “one-dimensional” would be
misleading.
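The difference equation (9.15) translates directly into a short iteration. This is a minimal sketch, assuming an interval wide enough that the periodic boundary can be ignored; the function name `step` is hypothetical. It reproduces the values quoted for $p = q = r = 1/3$ with the walker starting at $s = 0$.

```python
from fractions import Fraction

def step(dist, p, q, r):
    """Apply (9.15) once: push each p_t(s) to s-1, s, s+1 with weights p, q, r."""
    new = {}
    for s, prob in dist.items():
        for ds, w in ((-1, p), (0, q), (1, r)):
            new[s + ds] = new.get(s + ds, Fraction(0)) + w * prob
    return new

third = Fraction(1, 3)
p0 = {0: Fraction(1)}               # walker starts at s = 0
p1 = step(p0, third, third, third)  # single-time distribution at t = 1
p2 = step(p1, third, third, third)  # single-time distribution at t = 2
```

Pushing probability forward from each occupied site is equivalent to the "pull" form written in (9.15); it simply avoids looping over sites whose probability is zero.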

The single-time distribution $p_t(s)$ is a marginal probability distribution and
contains less information than the full probability distribution $\Pr(\mathbf{s})$ on the set of all
random walks. This is so even if one knows $p_t(s)$ for every value of $t$. In particular,
$p_t(s)$ does not tell one how the particle’s position is correlated at successive
times. For example, given $\Pr(\mathbf{s})$, one can show that the conditional probability
$\Pr(s_{t+1} \mid s_t)$ is zero whenever $|s_{t+1} - s_t|$ is larger than 1, whereas the values of
$p_1(s)$ and $p_2(s)$ in (9.16) are consistent with the possibility of the particle hopping
from $s = 1$ at $t = 1$ to $s = -2$ at $t = 2$. It is not a defect of $p_t(s)$ that it contains
less information than the total probability distribution $\Pr(\mathbf{s})$. Less detailed
descriptions are often very useful in helping one see the forest and not just the trees. But
one needs to be aware of the fact that the single-time distribution as a function of
time is far from being the full story.

For a Brownian particle the analog of $p_t(s)$ for the random walker is the single-time
probability distribution density $p_t(\mathbf{r})$, defined in such a way that the integral

$$\int_R p_t(\mathbf{r})\, d\mathbf{r} \tag{9.17}$$

over a region $R$ in three-dimensional space is the probability that the particle will
lie in this region at time $t$. In the simplest theory of Brownian motion, $p_t(\mathbf{r})$ satisfies
a partial differential equation

$$\partial p / \partial t = D \nabla^2 p, \tag{9.18}$$

where $D$ is the diffusion constant and $\nabla^2$ is the Laplacian. If the particle starts off
at $\mathbf{r} = 0$ when $t = 0$, the solution is

$$p_t(\mathbf{r}) = (4 \pi D t)^{-3/2} e^{-r^2/4Dt}, \tag{9.19}$$

where $r$ is the magnitude of $\mathbf{r}$.

Just as for $p_t(s)$ in the case of a random walk, $p_t(\mathbf{r})$ lacks information about the
correlation between positions of the Brownian particle at successive times. Suppose,
for example, that a particle starting at $\mathbf{r} = 0$ at time $t = 0$ is at $\mathbf{r}_1$ at a time
$t_1 > 0$. Then at a time $t_2 = t_1 + \epsilon$, where $\epsilon$ is small compared to $t_1$, there is a high
probability that the particle will still be quite close to $\mathbf{r}_1$. This fact is not, however,
reflected in $p_{t_2}(\mathbf{r})$, as (9.19) gives the probability density for the particle to be at $\mathbf{r}$
using no information beyond the fact that it was at the origin at $t = 0$.


9.3 The Born rule

As in Ch. 7, we shall consider an isolated system which does not interact with its 
environment, so that one can define unitary time development operators of the form
$T(t', t)$. To describe its stochastic time development one must assign probabilities
to histories forming a suitable sample space of the type discussed in Sec. 8.5. Just 
as in the case of the random walk considered in Sec. 9.1, these probabilities are 
determined both by the contingent information contained in initial data, and by a 
set of weights. The weights are given by the laws of quantum mechanics, and for 
an isolated system they can be computed using the time development operators. 

In this section we consider a very simple situation in which the initial datum is a
normalized state $|\psi_0\rangle$ at time $t_0$, and the histories involve only two times, $t_0$ and a
later time $t_1$ at which there is a decomposition of the identity corresponding to an
orthonormal basis $\{|\phi_1^k\rangle, k = 1, 2, \ldots\}$. Histories of the form

$$Y^k = [\psi_0] \odot [\phi_1^k], \tag{9.20}$$

together with a history

$$Z = (I - [\psi_0]) \odot I \tag{9.21}$$

constitute a decomposition of the history identity $I$, and thus a sample space of
histories based upon the initial state $[\psi_0]$, to use the terminology of Sec. 8.5. We
assign initial probabilities $p_0(I - [\psi_0]) = 0$ and $p_0([\psi_0]) = 1$, in the notation of
Sec. 9.1.

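The Born rule assigns a weight

$$W(Y^k) = |\langle \phi_1^k | T(t_1, t_0) | \psi_0 \rangle|^2 \tag{9.22}$$

to the history $Y^k$. These weights sum to 1,

$$\begin{aligned}
\sum_k W(Y^k) &= \sum_k \langle \psi_0 | T(t_0, t_1) | \phi_1^k \rangle \langle \phi_1^k | T(t_1, t_0) | \psi_0 \rangle \\
&= \langle \psi_0 | T(t_0, t_1) T(t_1, t_0) | \psi_0 \rangle = \langle \psi_0 | I | \psi_0 \rangle = \langle \psi_0 | \psi_0 \rangle = 1,
\end{aligned} \tag{9.23}$$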

because $|\psi_0\rangle$ is normalized and the $\{|\phi_1^k\rangle\}$ are an orthonormal basis. It is important
to notice that the Born rule does not follow from any other principle of quantum
mechanics. It is a fundamental postulate or axiom, the same as Schrödinger’s equation.
The weights can be used to assign probabilities to histories using the obvious
analog of (9.7), with the normalization $N$ equal to 1 because of (9.23):

$$\Pr(\phi_1^k) = \Pr(Y^k) = W(Y^k) = |\langle \phi_1^k | T(t_1, t_0) | \psi_0 \rangle|^2, \tag{9.24}$$

where $\Pr(\phi_1^k)$, which could also be written as $\Pr(\phi_1^k \mid \psi_0)$, is the probability of
the event $\phi_1^k$ at time $t_1$. The square brackets around $\phi_1^k$ have been omitted where
these dyads appear as arguments of probabilities, since this makes the notation less
awkward, and there is no risk of confusion. Given an observable of the form

$$V = \sum_k v_k [\phi_1^k] = \sum_k v_k |\phi_1^k\rangle \langle \phi_1^k|, \tag{9.25}$$

one can compute its average, see (5.42), at the time $t_1$ using the probability
distribution $\Pr(\phi_1^k)$:

$$\langle V \rangle = \sum_k v_k \Pr(\phi_1^k) = \langle \psi_0 | T(t_0, t_1)\, V\, T(t_1, t_0) | \psi_0 \rangle. \tag{9.26}$$

The validity of the right side becomes obvious when $V$ is replaced by the right side
of (9.25).

Let us analyze two simple but instructive examples. Consider a spin-half particle
in zero magnetic field, so that the spin dynamics is trivial: $H = 0$ and $T(t', t) = I$.
Let the initial state be

$$|\psi_0\rangle = |z^+\rangle. \tag{9.27}$$

For the first example use

$$|\phi_1^1\rangle = |z^+\rangle, \quad |\phi_1^2\rangle = |z^-\rangle \tag{9.28}$$

as the orthonormal basis at $t_1$. Then (9.24) results in

$$\Pr(\phi_1^1) = \Pr(z^+) = 1, \quad \Pr(\phi_1^2) = \Pr(z^-) = 0. \tag{9.29}$$

We have here an example of a unitary family of histories as defined in Sec. 8.7.
Since the ket $T(t_1, t_0)|\psi_0\rangle$ is equal to one of the basis vectors at $t_1$, it is necessarily
orthogonal to the other basis vector. Thus the unitary history $[\psi_0] \odot [\phi_1^1]$ has
probability 1, whereas the other history $[\psi_0] \odot [\phi_1^2]$ which begins with $[\psi_0]$ has
probability 0. It follows from (9.29) that

$$\langle S_z \rangle = 1/2, \tag{9.30}$$

where $S_z = \tfrac{1}{2}([z^+] - [z^-])$ is the operator for the $z$-component of spin angular
momentum in units of $\hbar$; see (5.30).

The second example uses the same initial state (9.27), but at $t_1$ an orthonormal
basis

$$|\bar\phi_1^1\rangle = |x^+\rangle, \quad |\bar\phi_1^2\rangle = |x^-\rangle, \tag{9.31}$$

where bars have been added to distinguish these kets from those in (9.28). A
straightforward calculation yields

$$\Pr(x^+) = 1/2 = \Pr(x^-). \tag{9.32}$$

Stated in words, if $S_z = +1/2$ at $t_0$, the probability is 1/2 that $S_x = +1/2$ at $t_1$,
and 1/2 that $S_x = -1/2$. Consequently, the average of the $x$-component of angular
momentum is

$$\langle S_x \rangle = 0. \tag{9.33}$$
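Both spin-half examples can be reproduced numerically from (9.24). The sketch below is illustrative only: kets are stored as lists of amplitudes in the $z$ basis, and it assumes the usual convention $|x^\pm\rangle = (|z^+\rangle \pm |z^-\rangle)/\sqrt{2}$, which is not restated in this section. The function names are hypothetical.

```python
import math

def inner(a, b):
    """<a|b> for kets stored as lists of amplitudes."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def born_probabilities(psi0, basis):
    """Pr(phi_k) = |<phi_k|T|psi0>|^2 of (9.24), here with T = I (zero field)."""
    return [abs(inner(phi, psi0)) ** 2 for phi in basis]

s = 1 / math.sqrt(2)
z_plus, z_minus = [1, 0], [0, 1]
x_plus, x_minus = [s, s], [s, -s]    # assumed convention for |x+>, |x->

pr_z = born_probabilities(z_plus, [z_plus, z_minus])   # first example, (9.29)
pr_x = born_probabilities(z_plus, [x_plus, x_minus])   # second example, (9.32)

# Averages as in (9.26): eigenvalues +1/2 and -1/2 weighted by the probabilities.
avg_sz = 0.5 * pr_z[0] - 0.5 * pr_z[1]
avg_sx = 0.5 * pr_x[0] - 0.5 * pr_x[1]
```

The two calls use the same initial ket but different bases at $t_1$, so each produces probabilities on its own sample space; the code computes them side by side, but nothing in it combines them into a single description.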

The second example may seem counterintuitive for the following reason. The 
unitary quantum dynamics is trivial: nothing at all is happening to this spin-half 
particle. It is not in a magnetic field, and therefore there is no reason why the spin 
should precess. Nonetheless, it might seem as if the spin orientation has managed 
to “jump” from being along the positive $z$ axis at time $t_0$ to an orientation either
along or opposite to the positive $x$ axis at $t_1$. However, the idea that something is
“jumping” comes from a misleading mental picture of a spin-half particle. To better 
understand the situation, imagine a classical object spinning in free space and not 
subject to any torques, so that its angular momentum is conserved. Suppose we 
know the $z$-component of its angular momentum at $t_0$, and for some reason want to
discuss the $x$-component at a later time $t_1$. The fact that two different components
of angular momentum are considered at the two different times does not mean
there has been a change in the angular momentum of the object between $t_0$ and
$t_1$. This analogy, like all classical analogies, is far from perfect, but in the present
context it is less misleading than thinking of $S_z = +1/2$ for a spin-half particle
as corresponding to a classical object with its total angular momentum in the +z 
direction. Applying this analogy to the quantum case, we see that the probabilities 
in (9.32) are not unreasonable, given that we have adopted a sample space in which 
values of $S_x$ occur at $t_1$, rather than values of $S_z$, as in the first example.

The odd thing about quantum theory is the fact that one cannot combine the con- 
clusions in (9.29) and (9.32) to form a single description of the time development 
of the particle, whereas it would be perfectly reasonable to do so for a classical 
spinning object. It is incorrect to conclude from (9.29) and (9.32) that at $t_1$ either it
is the case that $S_z = +1/2$ AND $S_x = +1/2$, or else it is the case that $S_z = +1/2$
AND $S_x = -1/2$. Both of the statements connected by AND are quantum nonsense,
as they do not correspond to anything in the quantum Hilbert space; see
Sec. 4.6. For the same reason the two averages (9.30) and (9.33) cannot be thought
of as applying simultaneously to the same system, since the observables $S_z$ and
$S_x$ do not commute with each other, and hence correspond to incompatible sample
spaces. It is always possible to apply the Born rule in a large number of different
ways by using different orthonormal bases at $t_1$, but these different results cannot
be combined in a single sensible quantum description of the system. Attempting to 
do so violates the single-framework rule (to be discussed in Sec. 16.1) and leads to 
confusion. 

The Born rule is often discussed in the context of measurements, as a formula to
compute the probabilities of various outcomes of a measurement carried out by an
apparatus $A$ on a system $S$. Hence it is worth emphasizing that the probabilities
in (9.24) refer to an isolated system S which is not interacting with a separate 
measurement device. Indeed, our discussion of the Born rule has made no reference 
whatsoever to measurements of any sort. Measurements will be taken up in Chs. 17 
and 18, where the usual formulas for the probabilities of different measurement 
outcomes will be derived by applying general quantum principles to the combined 
apparatus and measured system thought of as constituting a single, isolated system. 


9.4 Wave function as a pre-probability 

The basic formula (9.24) which expresses the Born rule can be rewritten in various
ways. One rather common form is the following. Let

$$|\psi_1\rangle = T(t_1, t_0)|\psi_0\rangle \tag{9.34}$$

be the wave function obtained by integrating Schrödinger’s equation from $t_0$ to $t_1$.
Then (9.24) can be written in the compact form

$$\Pr(\phi_1^k) = |\langle \phi_1^k | \psi_1 \rangle|^2. \tag{9.35}$$

Note that $|\psi_1\rangle$ or $[\psi_1]$, regarded as a quantum property at time $t_1$, is incompatible
with the collection of properties $\{[\phi_1^k]\}$ if at least two of the probabilities in (9.35)
are nonzero, that is, if one is not dealing with a unitary family. Thus in the second
spin-half example considered above, $|\psi_1\rangle = |z^+\rangle$ is incompatible with both $|x^+\rangle$
and $|x^-\rangle$. Therefore, in the context of the family based on (9.20) and (9.21) it does
not make sense to suppose that at $t_1$ the system possesses the physical property
$|\psi_1\rangle$. Instead, $|\psi_1\rangle$ must be thought of as a mathematical construct suitable for
calculating certain probabilities. We shall refer to $|\psi_1\rangle$ understood in this way as
a pre-probability, since it is (obviously) not a probability, nor a property of the
physical system, but instead something which is used to calculate probabilities. In
addition to wave functions obtained by unitary time development, density matrices
are often employed in quantum theory as pre-probabilities; see Ch. 15. The
pre-probability $|\psi_1\rangle$ is very convenient for calculations because it does not depend
upon which orthonormal basis is employed at $t_1$. The theoretical physicist
may want to compute probabilities for various different bases, that is, for various
different families of histories, and $|\psi_1\rangle$ is a convenient tool for doing this. There is
no harm in carrying out such calculations as long as one does not try to combine
the results for incompatible bases into a single description of the quantum system.

Another way to see that $|\psi_1\rangle$ on the right side of (9.35) is a calculational device
and not a physical property is to note that these probabilities can be computed
equally well by an alternative procedure. For each $k$, let

$$|\phi_0^k\rangle = T(t_0, t_1)|\phi_1^k\rangle \tag{9.36}$$

be the ket obtained by integrating Schrödinger’s equation backwards in time from
the final state $|\phi_1^k\rangle$. It is then obvious, see (9.24), that

$$\Pr(\phi_1^k) = |\langle \phi_0^k | \psi_0 \rangle|^2. \tag{9.37}$$

There is no reason in principle to prefer (9.35) to (9.37) as a method of calculating
these probabilities, and in fact there are a lot of other methods of obtaining the
same answer. For example, one can integrate $|\psi_0\rangle$ forwards in time and each $|\phi_1^k\rangle$
backwards in time until they meet at some intermediate time, and then evaluate the
absolute square of the inner product. To be sure, the most efficient procedure for
calculating $\Pr(\phi_1^k \mid \psi_0)$ for all values of $k$ is likely to be (9.35): one only has to do
one time integration, and then evaluate a number of inner products. But the fact
that other procedures are equally valid, and can give very different “pictures” of
what is going on at intermediate times if one takes them literally, is a warning that
one has no more justification for identifying $|\psi_1\rangle$, as defined in (9.34), as “the real
state of the system” at time $t_1$ than one has for identifying one or more of the $|\phi_0^k\rangle$,
as defined in (9.36), with “the real state of the system” at time $t_0$. Instead, both
$|\psi_1\rangle$ and the $|\phi_0^k\rangle$ are functioning as pre-probabilities.
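The equivalence of the forward, backward, and meet-in-the-middle procedures can be seen in a small numerical sketch. Everything specific here is an assumption made for illustration: a two-dimensional Hilbert space, discrete time steps, a one-step unitary `U` built from an arbitrary rotation angle and phase, and $t_1 - t_0 = 3$ steps, so that $T(t_1, t_0) = U^3$ and $T(t_0, t_1) = (U^\dagger)^3$.

```python
import cmath
import math

def matvec(U, v):
    """Apply the matrix U to the ket v."""
    return [sum(U[i][j] * v[j] for j in range(len(v))) for i in range(len(U))]

def dagger(U):
    """Conjugate transpose of a square matrix."""
    n = len(U)
    return [[U[j][i].conjugate() for j in range(n)] for i in range(n)]

def inner(a, b):
    """<a|b>."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def evolve(U, v, steps):
    """Apply U to v the given number of times."""
    for _ in range(steps):
        v = matvec(U, v)
    return v

# Hypothetical one-step unitary: a rotation followed by a phase on one component.
theta, phase = 0.3, cmath.exp(0.7j)
c, s = math.cos(theta), math.sin(theta)
U = [[complex(c), complex(-s)], [s * phase, c * phase]]
Ud = dagger(U)

psi0 = [1 + 0j, 0j]                                 # initial state at t0
phi1 = [complex(1 / math.sqrt(2))] * 2              # one basis ket at t1

forward = abs(inner(phi1, evolve(U, psi0, 3))) ** 2                 # as in (9.35)
backward = abs(inner(evolve(Ud, phi1, 3), psi0)) ** 2               # as in (9.37)
meet = abs(inner(evolve(Ud, phi1, 1), evolve(U, psi0, 2))) ** 2     # meet at t0 + 2
```

All three numbers agree to rounding error, although the intermediate kets produced along the way are quite different, which is the point made in the text about not reading any one of them as the real state of the system.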

It is evident from (9.26) and (9.34) that the average of an observable $V$ at time
$t_1$ can be written in the compact and convenient form

$$\langle V \rangle = \langle \psi_1 | V | \psi_1 \rangle, \tag{9.38}$$

where $|\psi_1\rangle$ is again functioning as a pre-probability. A similar expression holds
for any other observable $W$, and there is no harm in simultaneously calculating
the averages $\langle V \rangle$ and $\langle W \rangle$ provided one keeps in mind the fact that when $V$ and $W$
do not commute with each other, one cannot regard $\langle V \rangle$ and $\langle W \rangle$ as belonging
to a single (stochastic) description of a quantum system, for the two averages are 
necessarily based on incompatible sample spaces that cannot be combined. See the 
comments towards the end of Sec. 9.3 in connection with the example of a spin-half 
particle. Any time the symbol (V) is used with reference to the physical properties 
of a quantum system there is an implicit reference to a sample space, and ignoring 
this fact can lead to serious misunderstanding. 

It is important to remember when applying the Born formula that a family of 
histories involving two times tells us nothing at all about what happens at inter- 
mediate times. Such times can, of course, be introduced formally by extending the 
history, in the manner indicated in Sec. 8.4,

$$Y^k = [\psi_0] \odot [\phi_1^k] = [\psi_0] \odot I \odot I \odot \cdots I \odot [\phi_1^k], \tag{9.39}$$

for as many intermediate times as one wants. But each $I$ at the intermediate time
tells us nothing at all about what actually happens at this time. Imagine being
outdoors on a dark night during a thunderstorm. Each time the lightning flashes you
can see the world around you. Between flashes, you cannot tell what is going on.
To be sure, if we are curious about what is going on at intermediate times in a quan- 
tum history of the form (9.39), we can refine the history in the manner indicated in 
Sec. 8.6, by writing the projector as a sum of history projectors which include non- 
trivial information about the intermediate times, and then compute probabilities for 
these different possibilities. That, however, cannot be done by means of the Born
formula (9.22), and requires an extension of this formula which will be introduced
in the next chapter. 

A similar restriction applies to a wave function understood as a pre-probability.
Even if

$$|\psi_t\rangle = T(t, t_0)|\psi_0\rangle \tag{9.40}$$

is known for all values of the time $t$, it can only be used to compute probabilities
of histories involving just two times, $t_0$ and $t$. These probabilities are the quantum
analogs of the single-time probabilities $p_t(\mathbf{r})$ for a classical Brownian particle
which started off at a definite location at the initial time $t_0$. As discussed in Sec. 9.2,
$p_t(\mathbf{r})$ does not contain probabilistic information about correlations between particle
positions at intermediate times, and in the same way correlations between quantum
properties at different times cannot be computed from $|\psi_t\rangle$. Instead, one must use
the procedures discussed in the next chapter.


9.5 Application: Alpha decay 

A toy model of alpha decay was introduced in Sec. 7.4, see Fig. 7.2, as an example 
of unitary time evolution. In this section we shall apply the Born formula in order 
to calculate some of the associated probabilities, but before doing so it will be 
convenient to add a toy detector of the sort shown in Fig. 7.1, in order to detect the 
alpha particle after it leaves the nucleus, see Fig. 9.1. Let $\mathcal{M}$ be the Hilbert space
of the particle, and $\mathcal{N}$ that of the detector. For the combined system $\mathcal{M} \otimes \mathcal{N}$ we
define the time development operator to be

$$T = S_a R, \tag{9.41}$$

where $S_a$ is defined in (7.56) and $R$ in (7.53). Note the similarity with (7.52), which
means that the discussion of the operation of the detector found in Sec. 7.4, see 
Fig. 7.1, applies to the arrangement in Fig. 9.1, with a few obvious modifications. 
Assume that at $t = 0$ the alpha particle is at $m = 0$ inside the nucleus, which has
not yet decayed, and the detector is in its ready state $n = 0$, so the wave function
for the total system is

$$|\Psi_0\rangle = |m = 0\rangle \otimes |n = 0\rangle = |0, 0\rangle. \tag{9.42}$$

[Fig. 9.1. Toy model of alpha decay (Fig. 7.2) plus a detector.]

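Unitary time evolution using (9.41) results in

$$|\Psi_t\rangle = |\chi_t\rangle \otimes |0\rangle + |\omega_t\rangle \otimes |1\rangle, \tag{9.43}$$

where

$$\begin{aligned}
|\chi_1\rangle &= \alpha|0\rangle + \beta|1\rangle, & |\omega_1\rangle &= 0, \\
|\chi_2\rangle &= \alpha^2|0\rangle + \alpha\beta|1\rangle + \beta|2\rangle, & |\omega_2\rangle &= 0, \\
|\chi_3\rangle &= \alpha^3|0\rangle + \alpha^2\beta|1\rangle + \alpha\beta|2\rangle, & |\omega_3\rangle &= \beta|3\rangle,
\end{aligned} \tag{9.44}$$

and for $t \ge 4$

$$\begin{aligned}
|\chi_t\rangle &= \alpha^t|0\rangle + \alpha^{t-1}\beta|1\rangle + \alpha^{t-2}\beta|2\rangle, \\
|\omega_t\rangle &= \alpha^{t-3}\beta|3\rangle + \alpha^{t-4}\beta|4\rangle + \cdots + \beta|t\rangle.
\end{aligned} \tag{9.45}$$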

Let us apply the Born rule with $t_0 = 0$, $t_1 = t$ for some integer $t > 0$, using $|\Psi_0\rangle$
as the initial state at time $t_0$, and at time $t_1$ the orthonormal basis $\{|m, n\rangle\}$, in which
the alpha particle has a definite position $m$ and the detector either has or has not
detected the particle. The joint probability distribution of $m$ and $n$ at time $t$,

$$p_t(m, n) := \Pr([m, n]_t), \tag{9.46}$$

is easily computed by regarding $|\Psi_t\rangle$ in (9.43) as a pre-probability: $p_t(m, n)$ is the
absolute square of the coefficient of $|m\rangle$ in $|\chi_t\rangle$ if $n = 0$, or in $|\omega_t\rangle$ if $n = 1$. These
probabilities vanish except for the cases

$$p_t(0, 0) = e^{-t/\tau}, \tag{9.47}$$
$$p_t(m, 0) = \kappa e^{-(t-m)/\tau} \quad \text{for } m = 1, 2 \text{ and } m \le t, \tag{9.48}$$
$$p_t(m, 1) = \kappa e^{-(t-m)/\tau} \quad \text{for } 3 \le m \le t. \tag{9.49}$$

The positive constants $\kappa$ and $\tau$ are defined by

$$e^{-1/\tau} = |\alpha|^2, \quad \kappa = |\beta|^2 = 1 - |\alpha|^2. \tag{9.50}$$

The probabilities in (9.47)-(9.50) make good physical sense. The probability 
(9.47) that the alpha particle is still in the nucleus decreases exponentially with 
time, in agreement with the well-known exponential decay law for radioactive nuclei.
That $p_t(m, n)$ vanishes for $m$ larger than $t$ reflects the fact that the alpha
particle was (by assumption) inside the nucleus at $t = 0$ and, since it hops at most
one step during any time interval, cannot arrive at $m$ earlier than $t = m$. Finally, if
the alpha particle is at $m = 0$, 1 or 2, the detector is still in its ready state $n = 0$,
whereas for $m = 3$ or larger the detector will be in the state $n = 1$, indicating
that it has detected the particle. This is just what one would expect for a detector
designed to detect the particle as it hops from $m = 2$ to $m = 3$ (see the discussion
in Sec. 7.4). 
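The joint distribution (9.47)-(9.49) can be reproduced by iterating the toy dynamics on the amplitudes. The sketch below is a hedged reconstruction: the definitions of $S_a$ and $R$ live in Ch. 7 and are not repeated here, so the code assumes the simplest rules that reproduce (9.44), namely that $R$ flips the detector when the particle is at $m = 2$, that $S_a$ sends $|0\rangle$ to $\alpha|0\rangle + \beta|1\rangle$, and that it shifts every site $m \ge 1$ one step outward. Sites with $m < 0$ are never populated from this initial state and are ignored. The values of `alpha` and `beta` are arbitrary illustrative choices.

```python
import math

def step(state, alpha, beta):
    """One application of T = S_a R to amplitudes stored as {(m, n): amplitude}."""
    new = {}
    for (m, n), amp in state.items():
        if m == 2:        # R (assumed form): detector flips when the particle is at m = 2
            n = 1 - n
        if m == 0:        # S_a at the nucleus: stay with amplitude alpha, escape with beta
            targets = (((0, n), alpha * amp), ((1, n), beta * amp))
        else:             # S_a for m >= 1 (assumed form): hop one site outward
            targets = (((m + 1, n), amp),)
        for key, a in targets:
            new[key] = new.get(key, 0) + a
    return new

def joint_probabilities(t, alpha, beta):
    """p_t(m, n) of (9.46), using the evolved state as a pre-probability."""
    state = {(0, 0): 1.0}            # |Psi_0> = |0, 0>, eq. (9.42)
    for _ in range(t):
        state = step(state, alpha, beta)
    return {key: abs(a) ** 2 for key, a in state.items()}

alpha, beta = math.sqrt(0.8), math.sqrt(0.2)   # any pair with |alpha|^2 + |beta|^2 = 1
probs = joint_probabilities(5, alpha, beta)
tau = -1 / math.log(abs(alpha) ** 2)           # from (9.50)
kappa = abs(beta) ** 2
```

For $t = 5$ the entries of `probs` match $e^{-t/\tau}$ and $\kappa e^{-(t-m)/\tau}$, and every entry with $m \ge 3$ has $n = 1$ while every entry with $m \le 2$ has $n = 0$, in line with the discussion of the detector above.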

It is worth emphasizing once again that $p_t(m, n)$ is the quantum analog of the
single-time probability $p_t(s)$ for the random walk discussed in Sec. 9.2. The reason
is that the histories to which the Born rule applies involve only two times, $t_0$ and $t_1$
in the notation of Sec. 9.3, and thus no information is available as to what happens
between these times. Consequently, just as $p_t(s)$ does not tell us all there is to be
said about the stochastic behavior of a random walker, there is also more to the
story of (toy) alpha decay and its detection than is contained in $p_t(m, n)$. However,
providing a more detailed description of what is going on requires the additional 
mathematical tools introduced in the next chapter, and we shall return to the prob- 
lem of alpha decay using more sophisticated methods (and a better detector) in 
Sec. 12.4. 

It is not necessary to employ the basis $\{|m, n\rangle\}$ in order to apply the Born rule;
one could use any other orthonormal basis of $\mathcal{M} \otimes \mathcal{N}$, and there are many possibilities.
However, the physical properties which can be described by the resulting
probabilities depend upon which basis is used, and not every choice of basis at time
$t$ (an example will be considered in the next section) allows one to say whether
$n = 0$ or 1, that is, whether the detector has detected the particle. It is customary
to use the term pointer basis for an orthonormal basis, or more generally a decom- 
position of the identity such as employed in the generalized Born rule defined in 
Sec. 10.3, that allows one to discuss the outcomes of a measurement in a sensible 
way. (The term arises from a mental picture of a measuring device equipped with 
a visible pointer whose position indicates the outcome after the measurement is 
over.) Thus $\{|m, n\rangle\}$ is a pointer basis, but so is any basis of the form $\{|\xi^j, n\rangle\}$,
where $\{|\xi^j\rangle\}$, $j = 1, 2, \ldots$, is some orthonormal basis of $\mathcal{M}$. While quantum
calculations which are to be compared with experiments usually employ a pointer
basis for calculating probabilities, for obvious reasons, there is no fundamental
principle of quantum theory which restricts the Born rule to bases of this type. 


9.6 Schrödinger’s cat

What is the physical significance of the state $|\Psi_t\rangle$ which evolves unitarily from $|\Psi_0\rangle$
in the toy model discussed in the preceding section? This is a difficult question to
answer, because for $t \ge 3$ the state $|\Psi_t\rangle$ is of the form $|A\rangle + |B\rangle$, (9.43), where
$|A\rangle = |\chi_t\rangle \otimes |0\rangle$ has the significance that the alpha particle is inside or very close to the
nucleus and the detector is ready, whereas $|B\rangle = |\omega_t\rangle \otimes |1\rangle$ means that the detector
has triggered and the nucleus has decayed. What can be the significance of a linear
combination $|A\rangle + |B\rangle$ of states with quite distinct physical meanings? Could it
signify that the detector both has and has not detected the particle?

The difficulty of interpreting such wave functions is often referred to as the problem
or paradox of Schrödinger’s cat. In a famous paper Schrödinger pointed out
that in the case of alpha decay, unitary time evolution applied to the system consisting
of a decaying nucleus plus a detector will quite generally lead to a superposition
state $|S\rangle = |A\rangle + |B\rangle$, where the (macroscopic) detector either has, state $|B\rangle$, or
has not, state $|A\rangle$, detected the alpha particle. To dramatize the conceptual difficulty
Schrödinger imagined the detector hooked up to a device which would kill a
live cat upon detection of an alpha particle, thus raising the problem of interpreting
$|A\rangle + |B\rangle$ when $|A\rangle$ corresponds to an undecayed nucleus, untriggered detector,
and live cat, and $|B\rangle$ to a nucleus which has decayed, a triggered detector, and a
dead cat. We shall call $|A\rangle + |B\rangle$ a macroscopic quantum superposition or MQS
state when $|A\rangle$ and $|B\rangle$ correspond to situations which are macroscopically distinct,
and use the same terminology for a superposition of three or more macroscopically
distinct states. In the literature MQS states are often called Schrödinger cat states.

Rather than addressing the general problem of MQS states, let us return to the
toy model with its toy example of such a state and, to be specific, consider $|\Psi_5\rangle$
at $t = 5$ under the assumption that $\alpha$ and $\beta$ have been chosen so that $\langle\chi_5|\chi_5\rangle$ and
$\langle\omega_5|\omega_5\rangle$ are of the same order of magnitude, which will prevent us from escaping
the problem of interpretation by supposing that either $|A\rangle$ or $|B\rangle$ is very small
and can be ignored. It is easy to show that $[\Psi_5] = |\Psi_5\rangle\langle\Psi_5|$ does not commute
with either of the projectors $[n = 0]$ or $[n = 1]$. Nor does it commute with a
projector $[\hat{n}]$, where $|\hat{n}\rangle$ is some linear combination of $|n = 0\rangle$ and $|n = 1\rangle$. This
means that it makes no sense to say that the combined system has the property
$[\Psi_5]$, whatever its physical significance might be, while at the very same time the
detector has or has not detected the particle (or has some other physical property).
See the discussion in Sec. 4.6. Saying that the system is in the state $[\Psi_5]$ and
then ascribing a property to the detector is no more meaningful than assigning
simultaneous values to $S_x$ and $S_z$ for a spin-half particle. The converse is also true:
if it makes sense (using an appropriate quantum description) to say that the detector
is either ready or triggered at $t = 5$, then one cannot say that the combined system
has the property $[\Psi_5]$, because that would be nonsense.
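
The failure to commute can be checked numerically in a small stand-in model. The sketch below is not the toy model of Sec. 9.5: as an assumption made purely for illustration, it uses a two-level "particle" and a two-level detector, with $|A\rangle$ and $|B\rangle$ taken to be product states of equal norm. All names in the code are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for the toy model: a two-level "particle" and a
# two-level detector (n = 0 ready, n = 1 triggered).
e0 = np.array([1.0, 0.0])
e1 = np.array([0.0, 1.0])

A = np.kron(e0, e0) / np.sqrt(2)   # particle not emitted, detector ready
B = np.kron(e1, e1) / np.sqrt(2)   # particle emitted, detector triggered
S = A + B                          # normalized superposition |A> + |B>

proj_S = np.outer(S, S.conj())                    # [S] = |S><S|
proj_n0 = np.kron(np.eye(2), np.outer(e0, e0))    # I (x) [n = 0]
proj_n1 = np.kron(np.eye(2), np.outer(e1, e1))    # I (x) [n = 1]

def commutator(P, Q):
    """Return PQ - QP; it vanishes exactly when P and Q commute."""
    return P @ Q - Q @ P

comm0 = commutator(proj_S, proj_n0)
comm1 = commutator(proj_S, proj_n1)

# Nonzero norms: [S] commutes with neither detector projector.
print(np.linalg.norm(comm0), np.linalg.norm(comm1))
```

Because the commutators do not vanish, no single decomposition of the identity contains both $[S]$ and the detector properties, which is the algebraic content of the statement that they cannot enter the same description.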

Note that these considerations cause no problem for the analysis in Sec. 9.5,
because in applying the Born rule to the basis $\{|m, n\rangle\}$, $|\Psi_5\rangle$ is employed as a
pre-probability, Sec. 9.4, a convenient mathematical tool for calculating probabilities
which could also be computed by other methods. When it is used in this way there
is obviously no need to ascribe some physical significance to $|\Psi_5\rangle$, nor is there
any motivation for doing so, since $[\Psi_5]$ must in any case be excluded from any
meaningful quantum description based upon $\{|m, n\rangle\}$.

Very similar considerations apply to the situation considered by Schrödinger,
although analyzing it carefully requires a model of macroscopic measurement, see
Secs. 17.3 and 17.4. The question of whether the cat is dead or alive can be addressed
by using the Born rule with an appropriate pointer basis (as defined at the
end of Sec. 9.5), and one never has to give a physical interpretation to Schrödinger’s
MQS state $|S\rangle$, since it only enters the calculations as a pre-probability. In any case,
treating $[S]$ as a physical property is meaningless when one uses a pointer basis.
To be sure, this does not prevent one from asking whether $|S\rangle$ by itself has some
intuitive physical meaning. What the preceding discussion shows is that whatever
that meaning may be, it cannot possibly have anything to do with whether the cat
is dead or alive, as these properties will be incompatible with $[S]$. Indeed, it is
probably the case that the very concept of a “cat” (small furry animal, etc.) cannot
be meaningfully formulated in a way which is compatible with $[S]$.

Quite apart from MQS states, it is in general a mistake to associate a physical
meaning with a linear combination $|C\rangle = |A\rangle + |B\rangle$ by referring to the properties of
the separate states $|A\rangle$ and $|B\rangle$. For example, the state $|x^+\rangle$ for a spin-half particle
is a linear combination of $|z^+\rangle$ and $|z^-\rangle$, but its physical significance of $S_x = 1/2$
is unrelated to $S_z = \pm 1/2$. For another example, see the discussion of (2.27) in
Sec. 2.5. In addition, there is the problem that for a given $|C\rangle = |A\rangle + |B\rangle$, the
choice of $|A\rangle$ and $|B\rangle$ is far from unique. Think of an ordinary vector in three
dimensions: there are lots of ways of writing it as the sum of two other vectors,
even if one requires that these be mutually perpendicular, corresponding to the
not-unreasonable orthogonality condition $\langle A|B\rangle = 0$. But if $|C\rangle$ is equal to $|A'\rangle + |B'\rangle$
as well as to $|A\rangle + |B\rangle$, why base a physical interpretation upon $|A\rangle$ rather than
$|A'\rangle$? See the discussion of (2.28) in Sec. 2.5.

Returning once again to the toy model, it is worth emphasizing that $|\Psi_5\rangle$ is a perfectly
good element of the Hilbert space, and enters fundamental quantum theory
on precisely the same footing as all other states, despite our difficulty in assigning
it a simple intuitive meaning. In particular, we can choose an orthonormal basis at
$t_1 = 5$ which contains $|\Psi_5\rangle$ as one of its members, and apply the Born rule. The
result is that a weight of 1 is assigned to the unitary history $[\Psi_0] \odot [\Psi_5]$, and a
weight of 0 to all other histories in the family with initial state $[\Psi_0]$. This means
that the state $[\Psi_5]$ will certainly occur (probability 1) at $t = 5$ given the initial state
$[\Psi_0]$ at $t = 0$.

But if $[\Psi_5]$ occurs with certainty, how is it possible for there to be a different
quantum description in which $[n = 0]$ occurs with a finite probability, when
we know the two properties $[\Psi_5]$ and $[n = 0]$ cannot consistently enter the same
quantum description at the same time? The brief answer is that quantum probabilities
only have meaning within specific families, and those from incompatible
families (the term will be defined in Sec. 10.4, but we have here a particular
instance) cannot be combined. Going beyond the brief answer to a more detailed
discussion requires the material in the next chapter and its application to some additional
examples. The general principle which emerges is called the single-family
or single-framework rule, and is discussed in Sec. 16.1.


10.1 Chain operators and weights

The previous chapter showed how the Born rule can be used to assign probabilities
to a sample space of histories based upon an initial state $|\psi_0\rangle$ at $t_0$, and an orthonormal
basis $\{|\phi_1^\alpha\rangle\}$ of the Hilbert space at a later time $t_1$. In this chapter we show how
an extension of the Born rule can be used to assign probabilities to much more
general families of histories, including histories defined at an arbitrary number of
different times and using decompositions of the identity which are not limited to
pure states, provided certain consistency conditions are satisfied.

We begin by rewriting the Born weight (9.22) for the history

$$Y^\alpha = [\psi_0] \odot [\phi_1^\alpha] \tag{10.1}$$

in the following way:

$$\begin{aligned}
W(Y^\alpha) &= |\langle\phi_1^\alpha|T(t_1, t_0)|\psi_0\rangle|^2 = \langle\psi_0|T(t_0, t_1)|\phi_1^\alpha\rangle \langle\phi_1^\alpha|T(t_1, t_0)|\psi_0\rangle \\
&= \mathrm{Tr}\bigl([\psi_0]\, T(t_0, t_1)\, [\phi_1^\alpha]\, T(t_1, t_0)\bigr) = \mathrm{Tr}\bigl(K^\dagger(Y^\alpha) K(Y^\alpha)\bigr),
\end{aligned} \tag{10.2}$$

where the chain operator $K(Y)$ and its adjoint are given by the expressions

$$K(F_0 \odot F_1) = F_1 T(t_1, t_0) F_0, \quad K^\dagger(F_0 \odot F_1) = F_0 T(t_0, t_1) F_1 \tag{10.3}$$

in the case of a history $Y = F_0 \odot F_1$ involving just two times; recall that $T(t_0, t_1) =
T^\dagger(t_1, t_0)$. The steps from the left side to the right side of (10.2) are straightforward
but not trivial, and the reader may wish to work through them. Recall that if $|\psi\rangle$
is any normalized ket, $[\psi] = |\psi\rangle\langle\psi|$ is the projector onto the one-dimensional
subspace containing $|\psi\rangle$, and $\langle\psi|A|\psi\rangle$ is equal to $\mathrm{Tr}(|\psi\rangle\langle\psi|A) = \mathrm{Tr}([\psi]A)$.
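
The equalities in (10.2) can also be checked numerically. The following is a minimal sketch for a spin-half system; the rotation angle `theta`, the particular kets, and all function names are assumptions chosen for illustration and do not come from the text.

```python
import numpy as np

def proj(ket):
    """Projector [psi] = |psi><psi| for a normalized ket."""
    return np.outer(ket, ket.conj())

# Assumed example: an arbitrary unitary T(t1, t0) on a two-dimensional space.
theta = 0.7                                     # hypothetical rotation angle
T10 = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]], dtype=complex)
T01 = T10.conj().T                              # T(t0, t1) = T^dagger(t1, t0)

psi0 = np.array([1, 0], dtype=complex)                  # initial state at t0
phi1 = np.array([1, 1j], dtype=complex) / np.sqrt(2)    # a basis state at t1

# Left side of (10.2): the Born weight |<phi1|T(t1,t0)|psi0>|^2.
born = abs(np.vdot(phi1, T10 @ psi0)) ** 2

# Chain operator (10.3) and the two trace forms on the right side of (10.2).
K = proj(phi1) @ T10 @ proj(psi0)
trace_form = np.trace(proj(psi0) @ T01 @ proj(phi1) @ T10).real
chain_form = np.trace(K.conj().T @ K).real

print(born, trace_form, chain_form)   # the three numbers agree
```

The agreement of `trace_form` and `chain_form` relies on cycling operators inside the trace together with $[\psi_0]^2 = [\psi_0]$ and $[\phi_1]^2 = [\phi_1]$.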

For a general history of the form

$$Y = F_0 \odot F_1 \odot F_2 \odot \cdots F_f \tag{10.4}$$

with events at times $t_0 < t_1 < t_2 < \cdots t_f$ the chain operator is defined as

$$K(Y) = F_f\, T(t_f, t_{f-1})\, F_{f-1}\, T(t_{f-1}, t_{f-2}) \cdots T(t_1, t_0)\, F_0, \tag{10.5}$$

and its adjoint is given by the expression

$$K^\dagger(Y) = F_0\, T(t_0, t_1)\, F_1\, T(t_1, t_2) \cdots T(t_{f-1}, t_f)\, F_f. \tag{10.6}$$

Notice that the adjoint is formed by replacing each $\odot$ in (10.4) separating $F_j$ from
$F_{j+1}$ by $T(t_j, t_{j+1})$. In both $K$ and $K^\dagger$, each argument of any given $T$ is adjacent
to a projector representing an event at this particular time. One could just as well
define $K(Y)$ using (10.6) and its adjoint $K^\dagger(Y)$ using (10.5). The definition used
here is slightly more convenient for some purposes, but either convention yields
exactly the same expressions for weights and consistency conditions, so there is no
compelling reason to employ one rather than the other. Note that $Y$ is an operator
on the history Hilbert space $\breve{\mathcal{H}}$, and $K(Y)$ an operator on the original Hilbert space
$\mathcal{H}$. Operators of these two types should not be confused with one another.

The definition of $K(Y)$ in (10.5) makes good sense when the $F_j$ in (10.4) are any
operators on the Hilbert space, not just projectors. In addition, $K$ can be extended
by linearity to sums of tensor product operators of the type (10.4):

$$K(Y' + Y'' + Y''' + \cdots) = K(Y') + K(Y'') + K(Y''') + \cdots. \tag{10.7}$$

In this way, $K$ becomes a linear map of operators on the history space $\breve{\mathcal{H}}$ to operators
on the Hilbert space $\mathcal{H}$ of a system at a single time, and it is sometimes useful
to employ this extended definition.

The sequence of operators which make up the “chain” on the right side of (10.6)
is in the same order as the sequence of times $t_0 < t_1 < \cdots t_f$. This is important;
one does not get the same answer (in general) if the order is different. Thus for
$f = 2$, with $t_0 < t_1 < t_2$, the operator defined by (10.6) is different from

$$L^\dagger(Y) = F_0\, T(t_0, t_2)\, F_2\, T(t_2, t_1)\, F_1, \tag{10.8}$$

and it is $K(Y)$ not $L(Y)$ which yields physically sensible results.
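
To see the effect of the ordering concretely, the sketch below builds $K(Y)$ from (10.5) for a three-time spin-half history and compares its weight $\mathrm{Tr}[K^\dagger K]$ with that of the misordered operator $L(Y)$, the adjoint of (10.8). The helper names, the rotation angle, and the choice of projectors are assumptions made for illustration only.

```python
import numpy as np

def proj(ket):
    """Projector |ket><ket| for a normalized ket."""
    ket = np.asarray(ket, dtype=complex)
    return np.outer(ket, ket.conj())

def chain_operator(projectors, unitaries):
    """K(Y) of (10.5) for Y = F_0 (.) F_1 (.) ... (.) F_f.

    projectors = [F_0, ..., F_f]; unitaries[j] = T(t_{j+1}, t_j).
    """
    K = projectors[0]
    for F, T in zip(projectors[1:], unitaries):
        K = F @ T @ K          # later times are multiplied in on the left
    return K

def weight(K):
    """Tr[K^dagger K], a nonnegative real number."""
    return np.trace(K.conj().T @ K).real

a = np.pi / 8                  # hypothetical rotation angle per time step
T = np.array([[np.cos(a), -np.sin(a)],
              [np.sin(a),  np.cos(a)]], dtype=complex)

F0 = proj([1, 0])                           # [z+] at t0
F1 = proj(np.array([1, 1]) / np.sqrt(2))    # [x+] at t1
F2 = proj([0, 1])                           # [z-] at t2

K = chain_operator([F0, F1, F2], [T, T])    # F2 T(t2,t1) F1 T(t1,t0) F0

T20 = T @ T                    # T(t2, t0) = T(t2, t1) T(t1, t0)
T12 = T.conj().T               # T(t1, t2) = T^dagger(t2, t1)
L = F1 @ T12 @ F2 @ T20 @ F0   # adjoint of (10.8): events out of time order

print(weight(K), weight(L))    # the two weights differ
```

Writing `chain_operator` as a left-to-right loop over increasing times keeps each `T` next to the projectors for the two times it connects, mirroring the remark made after (10.6).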

When all the projectors in a history are onto pure states, the chain operator has
a particularly simple form when written in terms of dyads. For example, if

$$Y = |\psi_0\rangle\langle\psi_0| \odot |\phi_1\rangle\langle\phi_1| \odot |\omega_2\rangle\langle\omega_2|, \tag{10.9}$$

then the chain operator

$$K(Y) = \langle\omega_2|T(t_2, t_1)|\phi_1\rangle \cdot \langle\phi_1|T(t_1, t_0)|\psi_0\rangle \cdot |\omega_2\rangle\langle\psi_0| \tag{10.10}$$

is a product of complex numbers, often called transition amplitudes, with a dyad
$|\omega_2\rangle\langle\psi_0|$ formed in an obvious way from the first and last projectors in the history.
Given any projector $Y$ on the history space $\breve{\mathcal{H}}$, we assign to it a nonnegative
weight

$$W(Y) = \mathrm{Tr}[K^\dagger(Y) K(Y)] = \langle K(Y), K(Y)\rangle, \tag{10.11}$$

where the angular brackets on the right side denote an operator inner product
whose general definition is

$$\langle A, B\rangle := \mathrm{Tr}[A^\dagger B], \tag{10.12}$$

with $A$ and $B$ any two operators on $\mathcal{H}$. In an infinite-dimensional space the formula
(10.12) does not always make sense, since the trace of an operator is not
defined if one cannot write it as a convergent sum. Technical issues can be avoided
by restricting oneself to a finite-dimensional Hilbert space, where the trace is always
defined, or to operators on infinite-dimensional spaces for which (10.12) does
make sense.

Operators on a Hilbert space $\mathcal{H}$ form a linear vector space under addition and
multiplication by (complex) scalars. If $\mathcal{H}$ is $n$-dimensional, its operators form an
$n^2$-dimensional Hilbert space if one uses (10.12) to define the inner product. This
inner product has all the usual properties: it is antilinear in its left argument, linear
in its right argument, and satisfies:

$$\langle A, B\rangle^* = \langle B, A\rangle, \quad \langle A, A\rangle \geq 0, \tag{10.13}$$

with $\langle A, A\rangle = 0$ only if $A = 0$; see (3.92). Consequently, the weight $W(Y)$
defined by (10.11) is a nonnegative real number, and it is zero if and only if the
chain operator $K(Y)$ is zero. If one writes the operators as matrices using some
fixed orthonormal basis of $\mathcal{H}$, one can think of them as $n^2$-component vectors,
where each matrix element is one of the components of the vector. Addition of
operators and multiplying an operator by a scalar then follow the same rules as for
vectors, and the same is true of inner products. In particular, $\langle A, A\rangle$ is the sum of
the absolute squares of the $n^2$ matrix elements of $A$.
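
The identification of operators with $n^2$-component vectors is easy to verify numerically. The sketch below uses randomly generated $3 \times 3$ complex matrices; the seed, the dimension, and the function name `op_inner` are arbitrary choices made for illustration.

```python
import numpy as np

def op_inner(A, B):
    """Operator inner product <A, B> = Tr[A^dagger B], as in (10.12)."""
    return np.trace(A.conj().T @ B)

rng = np.random.default_rng(0)      # arbitrary seed, for reproducibility
n = 3
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

# Treat each matrix as an n^2-component vector: sum_ij conj(A_ij) B_ij.
vec_form = np.vdot(A.ravel(), B.ravel())

same_as_vector = np.isclose(op_inner(A, B), vec_form)
hermitian_sym = np.isclose(np.conj(op_inner(A, B)), op_inner(B, A))  # (10.13)
norm_squared = np.isclose(op_inner(A, A), np.sum(np.abs(A) ** 2))

print(same_as_vector, hermitian_sym, norm_squared)
```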

If $\langle A, B\rangle = 0$, we shall say that the operators $A$ and $B$ are (mutually) orthogonal.
Just as in the case of vectors in the Hilbert space, $\langle A, B\rangle = 0$ implies $\langle B, A\rangle = 0$,
so orthogonality is a symmetrical relationship between $A$ and $B$. Earlier we introduced
a different definition of operator orthogonality by saying that two projectors
$P$ and $Q$ are orthogonal if and only if $PQ = 0$. Fortunately, the new definition of
orthogonality is an extension of the earlier one: if $P$ and $Q$ are projectors, they are
also positive operators, and the argument following (3.93) in Sec. 3.9 shows that
$\mathrm{Tr}[PQ] = 0$ if and only if $PQ = 0$.

It is possible to have a history with a nonvanishing projector $Y$ for which $K(Y) = 0$.
These histories (and only these histories) have zero weight. We shall say that
they are dynamically impossible. They never occur, because they have probability
zero. For example, $[z^+] \odot [z^-]$ for a spin-half particle is dynamically impossible
in the case of trivial dynamics, $T(t', t) = I$.
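
Spelled out with the two-time chain operator of (10.3) and $T = I$, this example reads as follows (a worked step added for illustration):

```latex
K\bigl([z^+] \odot [z^-]\bigr) = [z^-]\, I\, [z^+]
  = |z^-\rangle \langle z^-|z^+\rangle \langle z^+| = 0,
\qquad \text{since } \langle z^-|z^+\rangle = 0 .
```

Hence $W = \langle K, K\rangle = 0$, even though the history projector $[z^+] \odot [z^-]$ itself is not zero.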


10.2 Consistency conditions and consistent families 

Classical weights of the sort used to assign probabilities in stochastic processes
such as a random walk, see Sec. 9.1, have the property that they are additive functions
on the sample space: if $E$ and $F$ are two disjoint collections of histories from
the sample space, then, as in (9.11),

$$W(E \cup F) = W(E) + W(F). \tag{10.14}$$

If quantum weights are to function the same way as classical weights, they too must
satisfy (10.14), or its quantum analog. Suppose that our sample space of histories is
a decomposition $\{Y^\alpha\}$ of the history identity. Any projector $Y$ in the corresponding
Boolean algebra can be written as

$$Y = \sum_\alpha \pi^\alpha Y^\alpha, \tag{10.15}$$

where each $\pi^\alpha$ is 0 or 1. Additivity of $W$ then corresponds to

$$W(Y) = \sum_\alpha \pi^\alpha W(Y^\alpha). \tag{10.16}$$

However, the weights defined using (10.11) do not, in general, satisfy (10.16).
Since the chain operator is a linear map, (10.15) implies that

$$K(Y) = \sum_\alpha \pi^\alpha K(Y^\alpha). \tag{10.17}$$

If we insert this in (10.11), and use the (anti)linearity of the operator inner product
(note that the $\pi^\alpha$ are real), the result is

$$W(Y) = \sum_\alpha \sum_\beta \pi^\alpha \pi^\beta \langle K(Y^\alpha), K(Y^\beta)\rangle, \tag{10.18}$$

whereas the right side of (10.16) is given by

$$\sum_\alpha \pi^\alpha W(Y^\alpha) = \sum_\alpha \pi^\alpha \langle K(Y^\alpha), K(Y^\alpha)\rangle. \tag{10.19}$$

In general, (10.18) and (10.19) will be different. However, in the case in which

$$\langle K(Y^\alpha), K(Y^\beta)\rangle = 0 \text{ for } \alpha \neq \beta, \tag{10.20}$$

only the diagonal terms $\alpha = \beta$ remain in the sum (10.18), so it is the same as
(10.19), and the additivity condition (10.16) will be satisfied. Thus a sufficient
condition for the quantum weights to be additive is that the chain operators associated
with the different histories in the sample space be mutually orthogonal in
terms of the inner product defined in (10.12). The approach we shall adopt is to
limit ourselves to sample spaces of quantum histories for which the equalities in
(10.20), known as consistency conditions, are fulfilled. Such sample spaces, or
the corresponding Boolean algebras, will be referred to as consistent families of
histories, or frameworks.
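
A small numerical illustration of how additivity can fail is sketched below. It uses two three-time spin-half histories with trivial dynamics, $[z^+] \odot [x^+] \odot [z^+]$ and $[z^+] \odot [x^-] \odot [z^+]$ (the histories $Y^1$ and $Y^3$ of (10.35) in Sec. 10.3); the variable and function names are assumptions made for this example.

```python
import numpy as np

def proj(ket):
    """Projector |ket><ket| for a normalized ket."""
    ket = np.asarray(ket, dtype=complex)
    return np.outer(ket, ket.conj())

def weight(K):
    """W = <K, K> = Tr[K^dagger K], as in (10.11)."""
    return np.trace(K.conj().T @ K).real

zp = [1, 0]                              # |z+>
xp = np.array([1, 1]) / np.sqrt(2)       # |x+>
xm = np.array([1, -1]) / np.sqrt(2)      # |x->

# Trivial dynamics T = I, so K(F0 (.) F1 (.) F2) = F2 F1 F0.
K1 = proj(zp) @ proj(xp) @ proj(zp)      # [z+] (.) [x+] (.) [z+]
K3 = proj(zp) @ proj(xm) @ proj(zp)      # [z+] (.) [x-] (.) [z+]
K13 = K1 + K3                            # linearity of K, as in (10.17)

cross = np.trace(K1.conj().T @ K3)       # off-diagonal term in (10.18)

# W(Y^1 + Y^3) versus W(Y^1) + W(Y^3); they differ by 2 Re(cross).
print(weight(K13), weight(K1) + weight(K3), cross.real)
```

Here the weight of the combined history exceeds the sum of the separate weights by twice the real part of the cross term, which is exactly the contribution that (10.20) requires to vanish.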

The consistency conditions in (10.20) are also called “decoherence conditions”, 
and the terms “decoherent family”, “consistent set”, and “decoherent set” are some- 
times used to denote a consistent family or framework. The adjective “consistent”, 
as we have defined it, applies to families of histories, and not to individual his- 
tories. However, a single history Y can be said to be inconsistent if there is no 
consistent family which contains it as one of the members of its Boolean algebra. 
For an example, see Sec. 11.8. 

A consequence of the consistency conditions is the following. Let $Y$ and $\bar{Y}$
be any two history projectors belonging to the Boolean algebra generated by the
decomposition $\{Y^\alpha\}$. Then

$$Y\bar{Y} = 0 \text{ implies } \langle K(Y), K(\bar{Y})\rangle = 0. \tag{10.21}$$

To see that this is true, write $Y$ and $\bar{Y}$ in the form (10.15), using coefficients $\bar{\pi}^\alpha$ for
$\bar{Y}$. Then

$$Y\bar{Y} = \sum_\alpha \pi^\alpha \bar{\pi}^\alpha Y^\alpha, \tag{10.22}$$

so $Y\bar{Y} = 0$ implies that

$$\pi^\alpha \bar{\pi}^\alpha = 0 \tag{10.23}$$

for every $\alpha$. Next use the expansion (10.17) for both $K(Y)$ and $K(\bar{Y})$, so that

$$\langle K(Y), K(\bar{Y})\rangle = \sum_{\alpha\beta} \pi^\alpha \bar{\pi}^\beta \langle K(Y^\alpha), K(Y^\beta)\rangle. \tag{10.24}$$

The consistency conditions (10.20) eliminate the terms with $\alpha \neq \beta$ from the sum,
and (10.23) eliminates those with $\alpha = \beta$, so one arrives at (10.21). On the other
hand, (10.21) implies (10.20) as a special case, since two different projectors in
the decomposition $\{Y^\alpha\}$ are necessarily orthogonal to each other. Consequently,
(10.20) and (10.21) are equivalent, and either one can serve as a definition of a
consistent family.

While (10.20) is sufficient to ensure the additivity of $W$, (10.16), it is by no
means necessary. It suffices to have

$$\mathrm{Re}[\langle K(Y^\alpha), K(Y^\beta)\rangle] = 0 \text{ for } \alpha \neq \beta, \tag{10.25}$$

where Re denotes the real part. We shall refer to these as “weak consistency conditions”.
Even weaker conditions may work in certain cases. The subject has not
been exhaustively studied. However, the conditions in (10.20) are easier to apply
in actual calculations than are any of the weaker conditions, and seem adequate
to cover all situations of physical interest which have been studied up till now.
Consequently, we shall refer to them from now on as “the consistency conditions”,
while leaving open the possibility that further study may indicate the need to use a
weaker condition that enlarges the class of consistent families.

What about sample spaces for which the consistency condition (10.20) is not sat- 
isfied? What shall be our attitude towards inconsistent families of histories? Within 
the consistent history approach to quantum theory such families are “meaningless” 
in the sense that there is no way to assign them probabilities within the context of 
a stochastic time development governed by the laws of quantum dynamics. This is 
not the first time we have encountered something which is “meaningless” within 
a quantum formalism. In the usual Hilbert space formulation of quantum theory, 
it makes sense to describe a spin-half particle as having its angular momentum 
along the +z axis, or along the +x axis, but trying to combine these two descrip- 
tions using “and” leads to something which lacks any meaning, because it does not 
correspond to any subspace in the quantum Hilbert space. See the discussion in 
Sec. 4.6. Consistency, on the other hand, is a more stringent condition, because a 
family of histories corresponding to an acceptable Boolean algebra of projectors 
on the history Hilbert space may still fail to satisfy the consistency conditions. 

Consistency is always something which is relative to dynamical laws. As will 
be seen in an example in Sec. 10.3, changing the dynamics can render a consistent 
family inconsistent, or vice versa. Note that the conditions in (10.20) refer to an 
isolated quantum system. If a system is not isolated and is interacting with its en- 
vironment, one must apply the consistency conditions to the system together with 
its environment, regarding the combination as an isolated system. A consistent 
family of histories for a system isolated from its environment may turn out to be 
inconsistent if interactions with the environment are “turned on.” Conversely, in- 
teractions with the environment can sometimes ensure the consistency of a family 
of histories which would be inconsistent were the system isolated. Environmen- 
tal effects go under the general heading of decoherence. (The term does not refer 
to the same thing as “decoherent” in “decoherent histories”, though the two are 
related, and this sometimes causes confusion.) Decoherence is an active field of 
research, and while there has been considerable progress, there is much that is still 
not well understood. A brief introduction to the subject will be found in Ch. 26. 

Must the orthogonality conditions in (10.20) be satisfied exactly, or should one
allow small deviations from consistency? Inasmuch as the consistency conditions
form part of the axiomatic structure of quantum theory, in the same sense as the
Born rule discussed in the previous chapter, it is natural to require that they be
satisfied exactly. On the other hand, as first pointed out by Dowker and Kent, it
is plausible that when the off-diagonal terms $\langle K(Y^\alpha), K(Y^\beta)\rangle$ in (10.24) are small
compared with the diagonal terms $\langle K(Y^\alpha), K(Y^\alpha)\rangle$, one can find a “nearby” family
of histories in which the consistency conditions are satisfied exactly. A nearby
family is one in which the original projectors used to define the events (properties
at a particular time) making up the histories in the family are replaced by projectors
onto nearby subspaces of the same dimension. For example, consider a projector
$[\phi]$ onto the subspace spanned by a normalized ket $|\phi\rangle$. The subspace $[\chi]$ spanned
by a second normalized ket $|\chi\rangle$ can be said to be near to $[\phi]$ provided $|\langle\chi|\phi\rangle|^2$ is
close to 1; that is, if the angle $\epsilon$ defined by

$$\sin^2(\epsilon) = 1 - |\langle\chi|\phi\rangle|^2 = \tfrac{1}{2}\, \mathrm{Tr}\bigl[([\chi] - [\phi])^2\bigr] \tag{10.26}$$

is small. Notice that this measure is left unchanged by unitary time evolution: if $|\chi\rangle$
is near to $|\phi\rangle$ then $T(t', t)|\chi\rangle$ is near to $T(t', t)|\phi\rangle$. For example, if $|\phi\rangle$ corresponds
to $S_z = +1/2$ for a spin-half particle, then a nearby $|\chi\rangle$ would correspond to
$S_w = +1/2$ for a direction $w$ close to the positive $z$ axis. Or if $\phi(x)$ is a wave
packet in one dimension, $\chi(x)$ might be the wave packet with its tails cut off, and
then normalized. Of course, the histories in the nearby family are not the same
as those in the original family. Nonetheless, since the subspaces which define the
events are close to the original subspaces, their physical interpretation will be rather
similar. In that case one would not commit a serious error by ignoring a small lack
of consistency in the original family.
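
The two forms of the measure (10.26), and its invariance under unitary time evolution, can be checked with a short calculation. In the sketch below the tilt angle, the unitary `T`, and the function names are hypothetical choices made for illustration; $|\chi\rangle$ is taken to be the $S_w = +1/2$ state for a direction $w$ tilted away from the $z$ axis by a small polar angle.

```python
import numpy as np

def proj(ket):
    """Projector |ket><ket| for a normalized ket."""
    return np.outer(ket, ket.conj())

def sin2_eps(chi, phi):
    """sin^2(eps) = 1 - |<chi|phi>|^2, the first form in (10.26)."""
    return 1.0 - abs(np.vdot(chi, phi)) ** 2

tilt = 0.1   # hypothetical polar angle of the direction w, in radians
phi = np.array([1, 0], dtype=complex)                               # S_z = +1/2
chi = np.array([np.cos(tilt / 2), np.sin(tilt / 2)], dtype=complex)  # S_w = +1/2

D = proj(chi) - proj(phi)
trace_form = 0.5 * np.trace(D @ D).real      # second form in (10.26)

# An arbitrary unitary (rotation times a phase), standing in for T(t', t).
a = 1.234
T = np.array([[np.cos(a), -np.sin(a)],
              [np.sin(a),  np.cos(a)]]) @ np.diag([1, np.exp(0.5j)])

before = sin2_eps(chi, phi)
after = sin2_eps(T @ chi, T @ phi)
print(before, trace_form, after)             # all three agree
```

For this choice the angle $\epsilon$ equals half the tilt angle, since the overlap of the two spin states is $\cos(\text{tilt}/2)$.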


10.3 Examples of consistent and inconsistent families 

As a first example, consider the family of two-time histories

$$Y^k = [\psi_0] \odot [\phi_1^k], \quad Z = (I - [\psi_0]) \odot I \tag{10.27}$$

used in Sec. 9.3 when discussing the Born rule. The chain operators

$$K(Y^k) = [\phi_1^k]\, T(t_1, t_0)\, [\psi_0] = \langle\phi_1^k|T(t_1, t_0)|\psi_0\rangle \cdot |\phi_1^k\rangle\langle\psi_0| \tag{10.28}$$

are mutually orthogonal because

$$\langle K(Y^k), K(Y^l)\rangle \propto \mathrm{Tr}\bigl(|\psi_0\rangle\langle\phi_1^k|\phi_1^l\rangle\langle\psi_0|\bigr) = \langle\psi_0|\psi_0\rangle \langle\phi_1^k|\phi_1^l\rangle \tag{10.29}$$

is zero for $k \neq l$. To complete the argument, note that

$$\langle K(Y^k), K(Z)\rangle = \mathrm{Tr}\bigl([\psi_0]\, T(t_0, t_1)\, [\phi_1^k]\, T(t_1, t_0)\, (I - [\psi_0])\bigr) \tag{10.30}$$

is zero, because one can cycle $[\psi_0]$ from the beginning to the end of the trace, and
its product with $(I - [\psi_0])$ vanishes. Consequently, all the histories discussed in
Ch. 9 are consistent, which justifies our having omitted any discussion of consistency
when introducing the Born rule.

The same argument works if we consider a more general situation in which the
initial state is a projector $\Psi_0$ onto a subspace which could have a dimension greater
than 1, and instead of an orthonormal basis we consider a general decomposition
of the identity in projectors

$$I = \sum_k P^k \tag{10.31}$$

at time $t_1$. The family of two-time histories

$$Y^k = \Psi_0 \odot P^k, \quad Z = (I - \Psi_0) \odot I \tag{10.32}$$

is again consistent, since for $k \neq l$

$$\langle K(Y^k), K(Y^l)\rangle \propto \mathrm{Tr}\bigl(\Psi_0\, T(t_0, t_1)\, P^k P^l\, T(t_1, t_0)\, \Psi_0\bigr) = 0 \tag{10.33}$$

because $P^k P^l = 0$, while $\langle K(Y^k), K(Z)\rangle = 0$ follows, as in (10.30), from cycling
operators inside the trace. (This argument is a special case of the general result in
Sec. 11.3 that any family based on just two times is automatically consistent.) The
probability of $Y^k$ is given by

$$\Pr(Y^k) = \mathrm{Tr}\bigl(P^k\, T(t_1, t_0)\, \Psi_0\, T(t_0, t_1)\bigr) / \mathrm{Tr}(\Psi_0), \tag{10.34}$$

which we shall refer to as the generalized Born rule. The factor of $1/\mathrm{Tr}(\Psi_0)$ is
needed to normalize the probability when $\Psi_0$ projects onto a space of dimension
greater than 1.
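
As an illustration of (10.34), the sketch below evaluates the generalized Born rule in a three-dimensional Hilbert space with a rank-2 initial projector. The dimension, the unitary, the decomposition of the identity, and the function name `generalized_born` are all assumptions chosen to keep the arithmetic transparent.

```python
import numpy as np

# Assumed example: three-dimensional Hilbert space, rank-2 initial projector.
Psi0 = np.diag([1.0, 1.0, 0.0]).astype(complex)

# Decomposition of the identity at t1, as in (10.31): ranks 1 and 2.
P = [np.diag([1.0, 0.0, 0.0]).astype(complex),
     np.diag([0.0, 1.0, 1.0]).astype(complex)]

# Hypothetical unitary T(t1, t0): a rotation mixing the first and third basis states.
a = np.pi / 3
c, s = np.cos(a), np.sin(a)
T10 = np.array([[c, 0, -s],
                [0, 1,  0],
                [s, 0,  c]], dtype=complex)
T01 = T10.conj().T                    # T(t0, t1) = T^dagger(t1, t0)

def generalized_born(Pk):
    """Pr(Y^k) from (10.34)."""
    return (np.trace(Pk @ T10 @ Psi0 @ T01) / np.trace(Psi0)).real

probs = [generalized_born(Pk) for Pk in P]

# Chain operators K(Y^k) = P^k T(t1, t0) Psi0 and their overlap, as in (10.33).
K = [Pk @ T10 @ Psi0 for Pk in P]
overlap = np.trace(K[0].conj().T @ K[1])

print(probs, sum(probs), abs(overlap))
```

The probabilities sum to 1 only because of the division by $\mathrm{Tr}(\Psi_0) = 2$; without it the weights would sum to the dimension of the initial subspace.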

Another situation in which the consistency conditions are automatically satis- 
fied is that of a unitary family as defined in Sec. 8.7. For a given initial state such a 
family contains one unitary history, (8.41), obtained by unitary time development 
of this initial state, and various nonunitary histories, such as (8.42). It is straight- 
forward to show that the chain operator for any nonunitary history in such a family 
is zero, and that the chain operators for unitary histories with different initial states 
(belonging to the same decomposition of the identity) are orthogonal to one an- 
other. Thus the consistency conditions are satisfied. If the initial condition assigns 
probability 1 to a particular initial state, the corresponding unitary history occurs 
with probability 1, and zero probability is assigned to every other history in the 
family. 

To find an example of an inconsistent family, one must look at histories defined
at three or more times. Here is a fairly simple example for a spin-half particle. The
five history projectors

$$\begin{aligned}
Y^0 &= [z^-] \odot I \odot I, \\
Y^1 &= [z^+] \odot [x^+] \odot [z^+], \\
Y^2 &= [z^+] \odot [x^+] \odot [z^-], \\
Y^3 &= [z^+] \odot [x^-] \odot [z^+], \\
Y^4 &= [z^+] \odot [x^-] \odot [z^-]
\end{aligned} \tag{10.35}$$

defined at the three times $t_0 < t_1 < t_2$ form a decomposition of the history identity,
and thus a sample space of histories. However, for trivial dynamics, $T = I$, the
family is inconsistent. To show this it suffices to compute the chain operators using
(10.10). In particular,

$$\begin{aligned}
K(Y^1) &= |\langle z^+|x^+\rangle|^2 \cdot |z^+\rangle\langle z^+|, \\
K(Y^3) &= |\langle z^+|x^-\rangle|^2 \cdot |z^+\rangle\langle z^+|
\end{aligned} \tag{10.36}$$

are not orthogonal, since $|\langle z^+|x^+\rangle|^2$ and $|\langle z^+|x^-\rangle|^2$ are both equal to $1/2$; indeed,

$$\langle K(Y^1), K(Y^3)\rangle = 1/4. \tag{10.37}$$

Similarly, $K(Y^2)$ and $K(Y^4)$ are not orthogonal, whereas $K(Y^1)$ is orthogonal
to $K(Y^2)$, and $K(Y^3)$ to $K(Y^4)$. In addition, $K(Y^0)$ is orthogonal to the chain
operators of the other histories. Since consistency requires that all pairs of chain
operators for distinct histories in the sample space be orthogonal, this is not a
consistent family.
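
The full table of inner products for this family can be generated mechanically. The following sketch computes the chain operators of the five histories in (10.35) for $T = I$ and collects $\langle K(Y^\alpha), K(Y^\beta)\rangle$ in a $5 \times 5$ matrix; the helper names and the matrix layout are choices made for illustration.

```python
import numpy as np

def proj(ket):
    """Projector |ket><ket| for a normalized ket."""
    ket = np.asarray(ket, dtype=complex)
    return np.outer(ket, ket.conj())

def chain_operator(projectors, T):
    """K(Y) of (10.5), using the same unitary T for every time step."""
    K = projectors[0]
    for F in projectors[1:]:
        K = F @ T @ K
    return K

I2 = np.eye(2, dtype=complex)
zp, zm = proj([1, 0]), proj([0, 1])
xp = proj(np.array([1, 1]) / np.sqrt(2))
xm = proj(np.array([1, -1]) / np.sqrt(2))

family = [            # the sample space (10.35)
    [zm, I2, I2],     # Y^0
    [zp, xp, zp],     # Y^1
    [zp, xp, zm],     # Y^2
    [zp, xm, zp],     # Y^3
    [zp, xm, zm],     # Y^4
]

K = [chain_operator(Y, I2) for Y in family]       # trivial dynamics, T = I

# gram[a, b] = <K(Y^a), K(Y^b)> = Tr[K(Y^a)^dagger K(Y^b)]
gram = np.array([[np.trace(Ka.conj().T @ Kb) for Kb in K] for Ka in K])
print(np.round(gram.real, 3))
```

The entry for the pair $(Y^1, Y^3)$ reproduces the value $1/4$ of (10.37), and the pair $(Y^2, Y^4)$ gives $-1/4$; every other off-diagonal entry vanishes.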

On the other hand, the same five histories in (10.35) can form a consistent family
if one uses a suitable dynamics. Suppose that there is a magnetic field along the $y$
axis, and the time intervals $t_1 - t_0$ and $t_2 - t_1$ are chosen in such a way that

$$T(t_1, t_0) = T(t_2, t_1) = R, \tag{10.38}$$

where $R$ is the unitary operator such that

$$\begin{aligned}
R|z^+\rangle &= |x^+\rangle, & R|z^-\rangle &= |x^-\rangle, \\
R|x^+\rangle &= |z^-\rangle, & R|x^-\rangle &= -|z^+\rangle,
\end{aligned} \tag{10.39}$$

where the second line is a consequence of the first when one uses the definitions
in (4.14). With this dynamics, $Y^2$ is a unitary history whose chain operator is
orthogonal to that of $Y^0$, because of the orthogonal initial states, while the chain
operators for $Y^1$, $Y^3$, and $Y^4$ vanish. Thus the consistency conditions are satisfied.
That the family (10.35) is consistent for one choice of dynamics and inconsistent
for another serves to emphasize the important fact, noted earlier, that consistency
depends upon the dynamical law of time evolution. This is not surprising given
that the probabilities assigned to histories depend upon the dynamical law.
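
The same check can be repeated with the dynamics (10.38). In the sketch below, $R$ is taken to be the matrix for a 90 degree rotation about the $y$ axis, and the overall phase of $|x^-\rangle$ is an assumption, chosen here so that all four relations in (10.39) hold; the projector $[x^-]$, and hence the consistency check, does not depend on that phase. Function and variable names are illustrative.

```python
import numpy as np

def proj(ket):
    """Projector |ket><ket| for a normalized ket."""
    ket = np.asarray(ket, dtype=complex)
    return np.outer(ket, ket.conj())

def chain_operator(projectors, T):
    """K(Y) of (10.5), using the same unitary T for every time step."""
    K = projectors[0]
    for F in projectors[1:]:
        K = F @ T @ K
    return K

s = 1 / np.sqrt(2)
z_plus, z_minus = np.array([1.0, 0.0]), np.array([0.0, 1.0])
x_plus = s * (z_plus + z_minus)
x_minus = s * (-z_plus + z_minus)    # assumed phase, so that (10.39) holds

R = s * np.array([[1.0, -1.0],
                  [1.0,  1.0]])      # rotation by 90 degrees about the y axis

# The four relations in (10.39).
checks = [np.allclose(R @ z_plus, x_plus), np.allclose(R @ z_minus, x_minus),
          np.allclose(R @ x_plus, z_minus), np.allclose(R @ x_minus, -z_plus)]

I2 = np.eye(2)
zp, zm, xp, xm = map(proj, (z_plus, z_minus, x_plus, x_minus))
family = [[zm, I2, I2], [zp, xp, zp], [zp, xp, zm], [zp, xm, zp], [zp, xm, zm]]

K = [chain_operator(Y, R) for Y in family]
gram = np.array([[np.trace(Ka.conj().T @ Kb) for Kb in K] for Ka in K])
print(checks)
print(np.round(gram.real, 3))        # diagonal: only Y^0 and Y^2 have weight
```

All off-diagonal inner products now vanish, and only $Y^0$ and $Y^2$ have nonzero weight, in agreement with the statement that the chain operators for $Y^1$, $Y^3$, and $Y^4$ are zero.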

A number of additional examples of consistent and inconsistent histories will
be discussed in Chs. 12 and 13. However, checking consistency by the process
of finding chain operators for every history in a sample space is rather tedious and
inefficient. Some general principles and various tricks explained in the next chapter
make this process a lot easier. However, the reader may prefer to move on to the
examples, and only refer back to Ch. 11 as needed.


10.4 Refinement and compatibility 

The refinement of a sample space of histories was discussed in Sec. 8.6. In essence,
the idea is the same as for any other quantum sample space: some or perhaps all of
the projectors in a decomposition of the identity are replaced by two or more finer
projectors whose sum is the coarser projector. It is important to note that even if the
coarser family one starts with is consistent, the finer family need not be consistent.
Suppose that $\mathcal{Z} = \{Z^\beta\}$ is a consistent sample space of histories, $\mathcal{Y} = \{Y^\alpha\}$ is a
refinement of $\mathcal{Z}$, and that

$$Z^1 = Y^1 + Y^2. \tag{10.40}$$

Then, by linearity,

$$K(Z^1) = K(Y^1) + K(Y^2). \tag{10.41}$$

When a vector is written as a sum of two other vectors, the latter need not be
perpendicular to each other, and, by analogy, there is no reason to suppose that the
terms on the right side of (10.41) are mutually orthogonal, $\langle K(Y^1), K(Y^2)\rangle = 0$, as
is necessary if $\mathcal{Y}$ is to be a consistent family. Another way in which $\mathcal{Y}$ may fail to
be consistent is the following. Since $\mathcal{Y}$ is a refinement of $\mathcal{Z}$, any projector in the
sample space $\mathcal{Z}$, for example $Z^3$, belongs to the Boolean algebra generated by $\mathcal{Y}$.
Because they represent mutually exclusive events, $Z^3 Z^1 = 0$, and because $Y^1$ and
$Y^2$ in (10.40) are projectors, this means that

$$Z^3 Y^1 = 0 = Z^3 Y^2. \tag{10.42}$$

In addition, the consistency of $\mathcal{Z}$ implies that

$$\langle K(Z^3), K(Z^1)\rangle = \langle K(Z^3), K(Y^1)\rangle + \langle K(Z^3), K(Y^2)\rangle = 0. \tag{10.43}$$

However, this does not mean that either $\langle K(Z^3), K(Y^1)\rangle$ or $\langle K(Z^3), K(Y^2)\rangle$ is
zero, whereas (10.21) implies, given (10.42), that both must vanish in order for $\mathcal{Y}$
to be consistent.

An example which illustrates these principles is the family $\mathcal{Y}$ whose sample
space is (10.35), regarded as a refinement of the coarser family $\mathcal{Z}$ whose sample
space consists of the three projectors $Y^0$, $Y^1 + Y^2$, and $Y^3 + Y^4$. As the histories in
$\mathcal{Z}$ depend (effectively) on only two times, $t_0$ and $t_1$, the consistency of this family is
a consequence of the general argument for the first example in Sec. 10.3. However,
the family $\mathcal{Y}$ is inconsistent for $T(t', t) = I$.

In light of these considerations, we shall say that two or more consistent families
are compatible if and only if they have a common refinement which is itself a
consistent family. In order for two consistent families $\mathcal{Y}$ and $\mathcal{Z}$ to be compatible,
two conditions, taken together, are necessary and sufficient. First, the projectors
for the two sample spaces, or decompositions of the history identity, $\{Y^\alpha\}$ and
$\{Z^\beta\}$ must commute with each other:

$$Y^\alpha Z^\beta = Z^\beta Y^\alpha \text{ for all } \alpha, \beta. \tag{10.44}$$

Second, the chain operators associated with distinct projectors of the form $Y^\alpha Z^\beta$
must be mutually orthogonal:

$$\langle K(Y^\alpha Z^\beta), K(Y^{\bar\alpha} Z^{\bar\beta})\rangle = 0 \text{ if } \alpha \neq \bar\alpha \text{ or } \beta \neq \bar\beta. \tag{10.45}$$

Note that (10.45) is automatically satisfied when $Y^\alpha Z^\beta = 0$, so one only needs
to check this condition for nonzero products. Similar considerations apply in an
obvious way to three or more families. Consistent families that are not compatible
are said to be (mutually) incompatible.


11.1 Introduction

The conditions which define a consistent family of histories were stated in Ch. 10. 
The sample space must consist of a collection of mutually orthogonal projectors 
that add up to the history identity, and the chain operators for different members of 
the sample space must be mutually orthogonal, (10.20). Checking these conditions 
is in principle straightforward. In practice it can be rather tedious. Thus if there are 
$n$ histories in the sample space, checking orthogonality involves computing $n$ chain
operators and then taking $n(n - 1)/2$ operator inner products to check that they are
mutually orthogonal. There are a number of simple observations, some definitions, 
and several “tricks” which can simplify the task of constructing a sample space of a 
consistent family, or checking that a given sample space is consistent. These form 
the subject matter of the present chapter. It is probably not worthwhile trying to 
read through this chapter as a unit. The reader will find it easier to learn the tricks 
by working through examples in Ch. 12 and later chapters, and referring back to 
this chapter as needed. 

The discussion is limited to families in which all the histories in the sample space 
are of the product form, that is, represented by a projector on the history space 
which is a tensor product of quantum properties at different times, as in (8.7). As 
in the remainder of this book, the “strong” consistency conditions (10.20) are used 
rather than the weaker (10.25). 


11.2 Support of a consistent family 

A sample space of histories and the corresponding Boolean algebra it generates
will be called complete if the sum of the projectors for the different histories in
the sample space is the identity operator $\breve{I}$ on the history Hilbert space, (8.23). As
noted at the end of Sec. 10.1, it is possible for the chain operator $K(Y)$ to be zero
even if the history projector $Y$ is not zero. The weight $W(Y) = \langle K(Y), K(Y)\rangle$ of
such a history is obviously 0, so the history is dynamically impossible. Conversely,
if $W(Y) = 0$, then $K(Y) = 0$; see the discussion in connection with (10.13). The
support of a consistent family of histories is defined to be the set of all the histories
in the sample space whose weight is strictly positive, that is, whose chain operators
do not vanish. In other words, the support is what remains in the sample space if
the histories of zero weight are removed. In general the support of a family is not
complete, as that term was defined above, but one can say that it is dynamically
complete.

When checking consistency, only histories lying in the support need be consid- 
ered, because a chain operator which is zero is (trivially) orthogonal to all other 
chain operators. Using this fact can simplify the task of checking consistency in 
certain cases, such as the families considered in Ch. 12. Zero-weight histories are 
nonetheless of some importance, for they help to determine which histories, in- 
cluding histories of finite weight, are included in the Boolean event algebra. See 
the comments in Sec. 11.5. 


11.3 Initial and final projectors 

Checking consistency is often simplified by paying attention to the initial and final 
projectors of the histories in the sample space. Thus suppose that two histories 


$Y = F_0 \odot F_1 \odot \cdots F_f$,
$Y' = F'_0 \odot F'_1 \odot \cdots F'_f$ (11.1)

are defined for the same set of times $t_0 < t_1 < \cdots t_f$. If either $F_0 F'_0 = 0$ or
$F_f F'_f = 0$, then one can easily show, by writing out the corresponding trace and
cycling operators around the trace, that $\langle K(Y), K(Y')\rangle = 0$. Consequently, one can
sometimes tell by inspection that two chain operators will be orthogonal, without 
actually computing what they are. 
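
The statement about initial and final projectors can be checked numerically. The following is a minimal sketch, assuming a four-dimensional Hilbert space, randomly generated unitaries standing in for the time development operators, and the ordering of factors in the chain operator given in (10.5). The helper names (`chain_operator`, `inner`, and so on) are illustrative choices and do not come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4  # assumed Hilbert space dimension

def random_unitary():
    # The Q factor of a QR decomposition of a complex matrix is unitary.
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q

def basis_projector(indices):
    # Projector onto the span of the chosen standard basis vectors.
    p = np.zeros((dim, dim), dtype=complex)
    for i in indices:
        p[i, i] = 1.0
    return p

def random_projector(rank):
    # Rotate a basis projector by a random unitary.
    u = random_unitary()
    return u @ basis_projector(range(rank)) @ u.conj().T

def chain_operator(projectors, steps):
    # K(Y) = F_f T(t_f, t_{f-1}) ... T(t_1, t_0) F_0; steps[j] plays T(t_{j+1}, t_j).
    k = projectors[0]
    for proj, step in zip(projectors[1:], steps):
        k = proj @ step @ k
    return k

def inner(a, b):
    # Operator inner product <A, B> = Tr(A^dagger B).
    return np.trace(a.conj().T @ b)

steps = [random_unitary(), random_unitary()]  # three times t_0 < t_1 < t_2

# Orthogonal final projectors, everything else generic.
y = [random_projector(2), random_projector(2), basis_projector([0, 1])]
yp = [random_projector(2), random_projector(2), basis_projector([2, 3])]
overlap_final = inner(chain_operator(y, steps), chain_operator(yp, steps))

# Orthogonal initial projectors, everything else generic.
z = [basis_projector([0]), random_projector(2), random_projector(3)]
zp = [basis_projector([1, 2]), random_projector(2), random_projector(3)]
overlap_initial = inner(chain_operator(z, steps), chain_operator(zp, steps))

print(abs(overlap_final), abs(overlap_initial))  # both at round-off level
```

Both overlaps vanish no matter what is chosen for the intermediate projectors or the dynamics, which is why the inspection of initial and final projectors is useful.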

If the sample space consists of histories with just two times $t_0 < t_1$, then the
family is automatically consistent. The reason is that the product of the history 
projectors for two different histories in the sample space is 0 (as the sample space 
consists of mutually exclusive possibilities). But in order that 


$(F_0 \odot F_1) \cdot (F'_0 \odot F'_1) = F_0 F'_0 \odot F_1 F'_1$ (11.2)

be 0, it is necessary that either $F_0 F'_0$ or $F_1 F'_1$ vanish. As we have just seen, either
possibility implies that the chain operators for the two histories are orthogonal. As 
this holds for any pair of histories in the sample space, the consistency conditions 
are satisfied. 

For families of histories involving three or more times, looking at the initial
and final projectors does not settle the problem of consistency, but it does make
checking consistency somewhat simpler. Suppose, for example, we are considering
a family of histories based upon a fixed initial state (see Sec. 11.5), with two
possible projectors at the final time based upon the decomposition

$t_f$: $I = P + \tilde{P}$. (11.3)

Then the sample space will consist of various histories, some of whose projectors
will have $P$ at the final time, and some $\tilde{P}$. The chain operator of a projector with
a final $P$ will be orthogonal to one with a final $\tilde{P}$. Thus we only need to check
whether the chain operators for the histories ending in $P$ are mutually orthogonal
among themselves, and, similarly, the mutual orthogonality of the chain operators
for histories ending in $\tilde{P}$. If the decomposition of the identity at the final time $t_f$
involves more than two projectors, one need only check the orthogonality of chain 
operators for histories which end in the same projector, as it is automatic when the 
final projectors are different. 

Yet another way of reducing the work involved in checking consistency can also 
be illustrated using (11.3). Suppose that at $t_{f-1}$ there is a decomposition of the
identity of the form

$t_{f-1}$: $I = \sum_m Q_m$, (11.4)


and suppose that we have already checked that the chain operators for the different
histories ending in $\tilde{P}$ are all mutually orthogonal. In that case we can be sure that
the chain operators for two histories with projectors

$Y = \Psi_0 \odot \cdots Q_m \odot P$,
$Y' = \Psi_0 \odot \cdots Q_{m'} \odot P$ (11.5)

ending in $P$ will also be orthogonal to each other, provided $m' \neq m$. The reason
is that by cycling operators around the trace in a suitable fashion one obtains an 
expression for the inner product of the chain operators in the form 

$\langle K(Y), K(Y')\rangle = \mathrm{Tr}\left(\cdots Q_m T(t_{f-1}, t_f) P T(t_f, t_{f-1}) Q_{m'}\right)$
$= \mathrm{Tr}\left(\cdots Q_m Q_{m'}\right) - \mathrm{Tr}\left(\cdots Q_m T(t_{f-1}, t_f) \tilde{P} T(t_f, t_{f-1}) Q_{m'}\right)$, (11.6)

where $\cdots$ refers to the same product of operators in each case. The second line of
the equation is obtained from the first by replacing $P$ by $(I - \tilde{P})$, using the linearity
of the trace, and noting that $T(t_{f-1}, t_f) T(t_f, t_{f-1})$ is the identity operator, see
(7.40). The trace of the product which contains $Q_m Q_{m'}$ vanishes, because $m' \neq m$
means that $Q_m Q_{m'} = 0$. The final trace vanishes because it is the inner product of
the chain operators for the histories obtained from $Y$ and $Y'$ in (11.5) by replacing
$P$ at the final position with $\tilde{P}$; by assumption, the orthogonality of these has already
been checked. Thus the right side of (11.6) vanishes, so the chain operators for $Y$
and $Y'$ are orthogonal.

If the decomposition of the identity at $t_f$ is into $n > 2$ projectors, the trick just
discussed can still be used; however, it is necessary to check the mutual orthogonality
of the chain operators for histories corresponding to each of $n - 1$ final
projectors before one can obtain a certain number of results for those ending in the
$n$th projector “for free”. If, rather than a fixed initial state $\Psi_0$, one is interested in
a decomposition of the identity at $t_0$ involving several projectors, there is an analogous
trick in which the projectors at $t_1$ play the role of the $Q_m$ in the preceding
discussion.


11.4 Heisenberg representation 

It is sometimes convenient to use the Heisenberg representation for the projectors 
and the chain operators, in place of the ordinary or Schrödinger representation
which we have been using up to now. Suppose $F_j$ is a projector representing an
event thought of as happening at time $t_j$. We define the corresponding Heisenberg
projector $\hat{F}_j$ using the formula

$\hat{F}_j = T(t_r, t_j) F_j T(t_j, t_r)$, (11.7)

where the reference time $t_r$ is arbitrary, but must be kept fixed while analyzing a
given family of histories. In particular, $t_r$ cannot depend upon $j$. One can, for
example, use $t_r = t_0$, but there are other possibilities as well. Given a history

$Y = F_0 \odot F_1 \odot \cdots F_f$ (11.8)

of events at the times $t_0 < t_1 < \cdots t_f$, the Heisenberg chain operator is defined
by:

$\hat{K}(Y) = \hat{F}_f \hat{F}_{f-1} \cdots \hat{F}_0 = T(t_r, t_f) K(Y) T(t_0, t_r)$, (11.9)

where the second equality is easily verified using the definition of $K(Y)$ in (10.5)
along with (11.7). Note that $\hat{K}(Y)$, like $K(Y)$, is a linear function of its argument.

Now let $Y'$ be a history similar to $Y$, except that each $F_j$ in (11.8) is replaced by
an event $F'_j$ (which may or may not be the same as $F_j$). Then it is easy to show that

$\langle \hat{K}(Y'), \hat{K}(Y)\rangle = \langle K(Y'), K(Y)\rangle = \mathrm{Tr}\left(\hat{F}'_0 \hat{F}'_1 \cdots \hat{F}'_f \hat{F}_f \hat{F}_{f-1} \cdots \hat{F}_0\right)$. (11.10)

(Note that the inner product of the Heisenberg chain operators does not depend
upon the choice of the reference time $t_r$.) Thus one obtains quite simple expressions
for weights of histories and inner products of chain operators by using the
Heisenberg representation. While this is not necessarily an advantage when doing
an explicit calculation, since time dependence has disappeared from (11.9) but one
still has to use it to calculate the $\hat{F}_j$ in terms of the $F_j$ via (11.7), it does make some
of the formulas simpler, and therefore more transparent. One disadvantage of using
Heisenberg projectors is that, unlike ordinary (Schrödinger) projectors, they do not
have a direct physical interpretation: what they signify in physical terms depends 
both on the form of the operator and on the dynamics of the quantum system. 
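
The two equalities in (11.9) and (11.10) are easy to confirm on a small example. The sketch below is only illustrative: it assumes a three-dimensional Hilbert space, a randomly chosen Hermitian Hamiltonian generating $T(t, t') = \exp[-i(t - t')H]$, rank-one projectors, and an arbitrary reference time; all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 3  # assumed dimension

# Random Hermitian Hamiltonian (an assumption made for the illustration).
a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
evals, evecs = np.linalg.eigh((a + a.conj().T) / 2)

def T(t_to, t_from):
    # T(t, t') = exp(-i (t - t') H), built from the eigendecomposition of H.
    phases = np.exp(-1j * (t_to - t_from) * evals)
    return (evecs * phases) @ evecs.conj().T

def random_projector():
    # Rank-one projector onto a random normalized ket.
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    v /= np.linalg.norm(v)
    return np.outer(v, v.conj())

def chain(projectors, times):
    # Schrodinger chain operator K(Y), as in (10.5).
    k = projectors[0]
    for j in range(1, len(times)):
        k = projectors[j] @ T(times[j], times[j - 1]) @ k
    return k

def heisenberg_chain(projectors, times, t_r):
    # Product of Heisenberg projectors T(t_r, t_j) F_j T(t_j, t_r), latest time on the left.
    k = np.eye(dim, dtype=complex)
    for f_j, t_j in zip(projectors, times):
        k = T(t_r, t_j) @ f_j @ T(t_j, t_r) @ k
    return k

def inner(x, y):
    # Operator inner product <X, Y> = Tr(X^dagger Y).
    return np.trace(x.conj().T @ y)

times = [0.0, 0.7, 1.5]
t_r = 0.3  # arbitrary but fixed reference time
Y = [random_projector() for _ in times]
Yp = [random_projector() for _ in times]

k, kp = chain(Y, times), chain(Yp, times)
kh, khp = heisenberg_chain(Y, times, t_r), heisenberg_chain(Yp, times, t_r)

# Second equality in (11.9) and first equality in (11.10).
second_equality = np.allclose(kh, T(t_r, times[-1]) @ k @ T(times[0], t_r))
same_inner = np.isclose(inner(khp, kh), inner(kp, k))
print(second_equality, same_inner)
```

Changing `t_r` alters the Heisenberg chain operators themselves but not their inner products, in agreement with the remark following (11.10).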


11.5 Fixed initial state 

A family of histories for the times $t_0 < t_1 < \cdots t_f$ based on an initial state $\Psi_0$ was
introduced in Sec. 8.5, see (8.30). Let us write the elements of the sample space in 
the form 


$Y^\alpha = \Psi_0 \odot X^\alpha$, (11.11)

where for each $\alpha$, $X^\alpha$ is a projector on the space $\breve{\mathcal{H}}$ of histories at times
$t_1 < t_2 < \cdots < t_f$, with identity $\breve{I}$, and

$\sum_\alpha X^\alpha = \breve{I}$, (11.12)

so that

$\sum_\alpha Y^\alpha = \Psi_0 \odot \breve{I}$. (11.13)

The index $\alpha$ may have many components, as in the case of the product of sample
spaces considered in Sec. 8.5. Since the $Y^\alpha$ do not add up to the history identity, we
complete the sample space by adding another history

$Z = (I - \Psi_0) \odot \breve{I}$, (11.14)


as in (8.31). 

The chain operator $K(Z)$ is automatically orthogonal to the chain operators of all
of the histories of the form (11.11) because the initial projectors are orthogonal, see
Sec. 11.3. Consequently, the necessary and sufficient condition that the consistency
conditions are satisfied for this sample space is that

$\langle K(\Psi_0 \odot X^\alpha), K(\Psi_0 \odot X^\beta)\rangle = 0$ for $\alpha \neq \beta$. (11.15)

As one normally assigns $\Psi_0$ probability 1 and $\tilde{\Psi}_0 = I - \Psi_0$ probability 0, the history $Z$ can
be ignored, and we shall henceforth assume that our sample space consists of the 
histories of the form (11.11). 

One consequence of (11.12) and the fact that the chain operator $K(Y)$ is a linear
function of $Y$, (10.7), is that

$\sum_\alpha K(Y^\alpha) = K(\Psi_0 \odot \breve{I}) = T(t_f, t_0)\Psi_0$. (11.16)

Of course, this is still true if we omit all the zero terms from the sum on the left
side, that is to say, if we sum only over histories in the support $S$ of the sample
space (as defined in Sec. 11.2):

$\sum_{\alpha \in S} K(Y^\alpha) = K(\Psi_0 \odot \breve{I}) = T(t_f, t_0)\Psi_0$. (11.17)

One can sometimes make use of the result (11.17) in the following way. Suppose
that we have found a certain collection $S$ of histories of the form (11.11), represented
by mutually orthogonal history projectors (that is, $X^\alpha X^\beta = 0$ if $\alpha \neq \beta$)
with nonzero weights. Suppose that, in addition, (11.17) is satisfied, but the $X^\alpha$ for
$\alpha \in S$ do not add up to $\breve{I}$, (11.12). Can we be sure of finding a set of zero-weight
histories of the form (11.11) so that we can complete our sample space in the sense
that (11.13) is satisfied? Generally there are several ways of completing a sample
space with histories of zero weight. One way is to define

$X' = \breve{I} - \sum_{\alpha \in S} X^\alpha$, $Y' = \Psi_0 \odot X'$. (11.18)

Then, since

$Y' + \sum_{\alpha \in S} Y^\alpha = \Psi_0 \odot \breve{I}$, (11.19)

it follows from the linearity of K, see (11.17), that 

K(Y') = 0. (11.20) 

Consequently, $Y'$ as defined in (11.18) is a zero-weight history of the correct type,
showing that there is at least one solution to our problem. 

However, Y' might not be the sort of solution we are looking for. The point is 
that while zero-weight histories never occur, and thus in some sense they can be 
ignored, nonetheless they help to determine what constitutes the Boolean algebra 
of histories, since this depends upon the sample space. Sometimes one wants to 
discuss a particular item in the Boolean algebra which occurs with finite probabil- 
ity, but whose very presence in the algebra depends upon the existence of certain 
zero-weight histories in the sample space. In such a case one might need to use a
collection of zero-weight history projectors adding up to Y' rather than Y' by itself. 

The argument which begins at (11.16) looks a bit simpler if one uses the Heisenberg
representation for the projectors and the chain operators. In particular, since

$\hat{K}(\Psi_0 \odot \breve{I}) = \hat{\Psi}_0$, (11.21)

we can write (11.17) as

$\sum_{\alpha \in S} \hat{K}(Y^\alpha) = \hat{\Psi}_0$, (11.22)

and since $\hat{K}(Y)$ is, like $K(Y)$, a linear function of its argument $Y$, the argument
leading to $\hat{K}(Y') = 0$, obviously equivalent to $K(Y') = 0$, is somewhat more
transparent.


11.6 Initial pure state. Chain kets 

If the initial projector of Sec. 11.5 projects onto a pure state,

$\Psi_0 = [\psi_0] = |\psi_0\rangle\langle\psi_0|$, (11.23)

where we will assume that $|\psi_0\rangle$ is normalized, there is an alternative route for calculating
weights and checking consistency which involves using chain kets rather
than chain operators. Since it is usually easier to manipulate kets than it is to carry
out the corresponding tasks on operators, using chain kets has advantages in terms
of both speed and simplicity. Suppose that $Y^\alpha$ in (11.11) has the form given in
(8.30),

$Y^\alpha = [\psi_0] \odot P_1^{\alpha_1} \odot P_2^{\alpha_2} \odot \cdots P_f^{\alpha_f}$, (11.24)

with projectors at $t_1$, $t_2$, etc. drawn from decompositions of the identity of the type
(8.25). Then it is easy to see that the corresponding chain operator is of the form

$K(Y^\alpha) = |\alpha\rangle\langle\psi_0|$, (11.25)

where the chain ket $|\alpha\rangle$ is given by the expression

$|\alpha\rangle = P_f^{\alpha_f} T(t_f, t_{f-1}) \cdots P_2^{\alpha_2} T(t_2, t_1) P_1^{\alpha_1} T(t_1, t_0)|\psi_0\rangle$. (11.26)

That is, start with $|\psi_0\rangle$, integrate Schrödinger’s equation from $t_0$ to $t_1$, and apply
the projector $P_1^{\alpha_1}$ to the result in order to obtain

$|\phi_1\rangle = P_1^{\alpha_1} T(t_1, t_0)|\psi_0\rangle$. (11.27)

Next use $|\phi_1\rangle$ as the starting state, integrate Schrödinger’s equation from $t_1$ to $t_2$,
and apply $P_2^{\alpha_2}$. Continuing in this way will eventually yield $|\alpha\rangle$, where the symbol
$\alpha$ stands for $(\alpha_1, \alpha_2, \ldots \alpha_f)$.

The inner product of two chain operators of the form (11.25) is the same as the
inner product of the corresponding chain kets:

$\langle K(Y^\alpha), K(Y^\beta)\rangle = \mathrm{Tr}\left(K^\dagger(Y^\alpha) K(Y^\beta)\right)$
$= \mathrm{Tr}\left(|\psi_0\rangle\langle\alpha|\beta\rangle\langle\psi_0|\right) = \langle\alpha|\beta\rangle$. (11.28)

Consequently, the consistency condition becomes

$\langle\alpha|\beta\rangle = 0$ for $\alpha \neq \beta$, (11.29)

while the weight of a history is

$W(Y^\alpha) = \langle\alpha|\alpha\rangle$. (11.30)

In the special case in which one of the projectors at time $t_f$ projects onto a pure
state $|\alpha_f\rangle$, the chain ket will be a complex constant, which could be 0, times $|\alpha_f\rangle$.
If two or more histories in the sample space have the same final projector onto a
pure state $|\alpha_f\rangle$, then consistency requires that at most one of these chain kets can
be nonzero.
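
The recipe for building a chain ket, and the equality (11.28) between operator and ket inner products, can be illustrated as follows. This is a sketch under stated assumptions: a two-dimensional Hilbert space, three times, random unitaries in place of the time development operators, and randomly chosen orthonormal bases for the decompositions of the identity. The names `chain_ket` and `chain_operator` are hypothetical.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
dim = 2  # assumed dimension

def random_unitary():
    # The Q factor of a QR decomposition of a complex matrix is unitary.
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q

def basis_projectors(u):
    # Decomposition of the identity into projectors onto the columns of a unitary u.
    return [np.outer(u[:, i], u[:, i].conj()) for i in range(dim)]

psi0 = np.array([1.0, 0.0], dtype=complex)      # normalized initial ket
steps = [random_unitary(), random_unitary()]     # stand-ins for T(t_1,t_0), T(t_2,t_1)
decomps = [basis_projectors(random_unitary()) for _ in steps]

def chain_ket(alpha):
    # (11.26): evolve one step, project, evolve one step, project, ...
    ket = psi0
    for step, decomp, a in zip(steps, decomps, alpha):
        ket = decomp[a] @ (step @ ket)
    return ket

def chain_operator(alpha):
    # K(Y^alpha) computed directly as an operator product starting from [psi_0].
    k = np.outer(psi0, psi0.conj())
    for step, decomp, a in zip(steps, decomps, alpha):
        k = decomp[a] @ step @ k
    return k

labels = list(itertools.product(range(dim), repeat=len(steps)))
agree = all(
    np.isclose(
        np.trace(chain_operator(al).conj().T @ chain_operator(be)),
        np.vdot(chain_ket(al), chain_ket(be)),   # <alpha|beta>
    )
    for al in labels
    for be in labels
)

# Summing all chain kets reproduces the unitarily evolved initial state.
total = sum(chain_ket(al) for al in labels)
print(agree, np.allclose(total, steps[1] @ steps[0] @ psi0))
```

With randomly chosen decompositions the family is generally not consistent, so the off-diagonal inner products need not vanish; the point is only that the ket and operator calculations always agree, while the ket version involves far less arithmetic.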

The analog of the argument in Sec. 11.5 following (11.17) leads to the following 
conclusion. Suppose one has a collection S of nonzero chain kets of the form 
(11.26) with the property that

$\sum_{\alpha \in S} |\alpha\rangle = T(t_f, t_0)|\psi_0\rangle$. (11.31)

That is, they add up to the state produced by the unitary time evolution of $|\psi_0\rangle$
from $t_0$ to $t_f$. Suppose also that for the collection $S$ the consistency conditions
(11.29) are satisfied. Then one knows that the collection of histories $\{Y^\alpha : \alpha \in S\}$
is the support of a consistent family: there is at least one way (and usually there
are many different ways) to add histories of zero weight to the support $S$ in order
to have a sample space satisfying (11.13), with $\Psi_0 = [\psi_0]$. Nonetheless, for the
reasons discussed towards the end of Sec. 11.5, it is sometimes a good idea to go 
ahead and construct the zero-weight histories explicitly, in order to have a Boolean 
algebra of history projectors with certain specific properties, rather than relying on 
a general existence proof. 


11.7 Unitary extensions 

For the following discussion it is convenient to use the Heisenberg representation
introduced in Sec. 11.4, even though the concept of unitary extensions works
equally well for the ordinary (Schrödinger) representation. Unitary histories were
introduced in Sec. 8.7 and defined by (8.38). An equivalent definition is that the
corresponding Heisenberg operators be identical,

$\hat{F}_0 = \hat{F}_1 = \cdots \hat{F}_f$, (11.32)

where we have used $t_0$ as the initial time rather than $t_1$ as in Sec. 8.7. It is obvious
from (11.9) that the Heisenberg chain operator $\hat{K}$ for a unitary history is the
projector $\hat{F}_0$.

Next suppose that in place of (11.32) we have

$\hat{F}_0 = \hat{F}_1 = \cdots \hat{F}_{m-1} \neq \hat{F}_m = \hat{F}_{m+1} = \cdots \hat{F}_f$, (11.33)

where $m$ is some integer in the interval $1 \leq m \leq f$. We shall call this a “one-jump
history”, because the Heisenberg projectors are not all equal; there is a change, or
“jump” between $t_{m-1}$ and $t_m$. In a one-jump history there are precisely two types of
Heisenberg projectors, with all the projectors of one type occurring at times which
are earlier than the first occurrence of a projector of the other type. The chain
operator for a history with one jump is $\hat{K} = \hat{F}_f \hat{F}_0$. (If, as is usually the case, $\hat{F}_0$
and $\hat{F}_f$ do not commute, $\hat{K}$ is not a projector.) Similarly, a history with two jumps
is characterized by

$\hat{F}_0 = \cdots \hat{F}_{m-1} \neq \hat{F}_m = \cdots \hat{F}_{m'-1} \neq \hat{F}_{m'} = \cdots \hat{F}_f$, (11.34)

with $m$ and $m'$ two integers in the range $1 \leq m < m' \leq f$, and its chain operator
is $\hat{K} = \hat{F}_f \hat{F}_m \hat{F}_0$. (It could be the case that $\hat{F}_f = \hat{F}_0$.) Histories with three or more
jumps are defined in a similar way.

A unitary extension of a unitary history (11.32) is obtained by adding some additional
times, which may be earlier than $t_0$ or between $t_0$ and $t_f$ or later than $t_f$; the
only restriction is that the new times do not appear in the original list $t_0, t_1, \ldots t_f$.
At each new time the projector for the event is chosen so that the corresponding
Heisenberg projector is identical to those in the original history, (11.32). Hence, a
unitary extension of a unitary history is itself a unitary history, and its Heisenberg
chain operator is $\hat{F}_0$, the same as for the original history.

A unitary extension of a history with one jump is obtained by including additional
times, and requiring that the corresponding Heisenberg projectors are such
that the new history has one jump. This means that if a new time $t'$ precedes $t_{m-1}$
in (11.33), the corresponding Heisenberg projector $\hat{F}'$ will be $\hat{F}_0$, whereas if it follows
$t_m$, $\hat{F}'$ will be $\hat{F}_m$. If additional times are introduced between $t_{m-1}$ and $t_m$, then
the Heisenberg projectors corresponding to these times must all be $\hat{F}_0$, or all $\hat{F}_m$, or
if some are $\hat{F}_0$ and some are $\hat{F}_m$, then all the times associated with the former must
precede the earliest time associated with the latter. The Heisenberg chain operator
of the extension is the same as for the original history, $\hat{F}_f \hat{F}_0$.

Unitary extensions of histories with two or more jumps follow the same pattern. 
One or more additional times are introduced, and the corresponding Heisenberg 
projectors must be such that the number of jumps in the new history is the same 
as in the original history. As a consequence, the Heisenberg chain operator is left 
unchanged. By using a limiting process it is possible to produce a unitary extension 
of a history in which events are defined on a continuous time interval. However, it 
is not clear that there is any advantage to doing so.

The fact that the Heisenberg chain operator is not altered in forming a unitary
extension means that the weight W of an extended history is the same as that of 
the original history. Likewise, if the chain operators for a collection of histories 
are mutually orthogonal, the same is true for the chain operators of the unitary 
extensions. These results can be used to extend a consistent family of histories to 
include additional times without having to recheck the consistency conditions or 
recalculate the weights. 

There is a slight complication in that while the histories obtained by unitary 
extension of the histories in the original sample space form the support of the new 
sample space, one needs additional zero-weight histories so that the projectors will 
add up to the history identity (or the projector for an initial state). The argument 
which follows shows that such zero-weight histories will always exist. Imagine 
that some history is being extended in steps, adding one more time at each step. 
Suppose that $t'$ has just been added to the set of times, with $\hat{F}'$ the corresponding
Heisenberg projector. We now define a zero-weight history which has the same
set of times as the newly extended history, and the same projectors at these times,
except that at $t'$ the projector $\hat{F}'$ is replaced with its complement

$\hat{F}'' = I - \hat{F}'$. (11.35)

What is $\hat{K}''$ for the history containing $\hat{F}''$? Since the unitary extension had the
same number of jumps as the original history, $\hat{F}''$ must occur next to an $\hat{F}'$ in the
product which defines $\hat{K}''$, and this means that $\hat{K}'' = 0$, since $\hat{F}'\hat{F}'' = 0$. Thus
we have produced a zero-weight history whose history projector when added to
that of the newly extended history yields the projector for the history before the
extension, since $\hat{F}' + \hat{F}'' = I$. Consequently, by carrying out unitary extensions
in successive steps, at each step we generate zero-weight histories of the form 
needed to produce a final sample space in which all the history projectors add up 
to the desired answer. While the procedure just described can always be applied 
to generate a sample space, there will usually be other ways to add zero-weight 
histories, and since the choice of zero-weight histories can determine what events 
occur in the final Boolean algebra, as noted towards the end of Sec. 11.5, one may
prefer to use some alternative to the procedure just described. 
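
A small numerical example may help make the construction concrete. The sketch below assumes a three-dimensional Hilbert space, a random Hermitian Hamiltonian, a two-time history with one jump, and the reference time $t_r = t_0$; it adds one intermediate time by unitary extension and also builds the zero-weight companion history of (11.35). All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
dim = 3  # assumed dimension

# Random Hermitian Hamiltonian (an assumption made for the illustration).
a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
evals, evecs = np.linalg.eigh((a + a.conj().T) / 2)

def T(t_to, t_from):
    # T(t, t') = exp(-i (t - t') H).
    phases = np.exp(-1j * (t_to - t_from) * evals)
    return (evecs * phases) @ evecs.conj().T

def random_projector():
    # Rank-one projector onto a random normalized ket.
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    v /= np.linalg.norm(v)
    return np.outer(v, v.conj())

def chain(projectors, times):
    # Schrodinger chain operator, as in (10.5).
    k = projectors[0]
    for j in range(1, len(times)):
        k = projectors[j] @ T(times[j], times[j - 1]) @ k
    return k

def weight(k):
    # W = <K, K> = Tr(K^dagger K).
    return np.trace(k.conj().T @ k).real

# One-jump history at times t_0 < t_1 with Schrodinger projectors F_0, F_1.
t0, t1 = 0.0, 1.0
F0, F1 = random_projector(), random_projector()
k_original = chain([F0, F1], [t0, t1])

# Unitary extension: a new time t' between t_0 and t_1 whose Heisenberg
# projector equals that of F_0, i.e. F' = T(t', t_0) F_0 T(t_0, t').
tp = 0.4
Fp = T(tp, t0) @ F0 @ T(t0, tp)
k_extended = chain([F0, Fp, F1], [t0, tp, t1])

# Zero-weight companion: replace F' by its complement I - F', as in (11.35).
k_complement = chain([F0, np.eye(dim) - Fp, F1], [t0, tp, t1])

print(weight(k_original), weight(k_extended), weight(k_complement))
```

The extended history has the same weight as the original one, and the companion history has weight zero (up to round-off), so the two together account for the history that existed before the extension.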


11.8 Intrinsically inconsistent histories 

A single history is said to be intrinsically inconsistent, or simply inconsistent, if 
there is no consistent family which contains it as one of the elements of the Boolean 
algebra. The smallest Boolean algebra of histories which contains a history projector
$Y$ consists of 0, $Y$, $\tilde{Y} = \breve{I} - Y$, and the history identity $\breve{I}$. Since $Y\tilde{Y} = 0$,

$\langle K(Y), K(\tilde{Y})\rangle \neq 0$, (11.36)

see (10.21), is a necessary and sufficient condition that $Y$ be intrinsically inconsistent.

If one restricts attention to histories which are product projectors, (8.6), no history
involving just two times can be intrinsically inconsistent, so the simplest possibility
is a three-time history of the form

$Y = A \odot B \odot C$. (11.37)

Given $Y$, define the three histories

$Y' = A \odot \tilde{B} \odot C$,
$Y'' = A \odot I \odot \tilde{C}$, (11.38)
$Y''' = \tilde{A} \odot I \odot I$,

where, as usual, $\tilde{P}$ stands for $I - P$. Then it is evident that

$Y + Y' + Y'' + Y''' = I \odot I \odot I = \breve{I}$, (11.39)

so that

$\tilde{Y} = Y' + Y'' + Y'''$, (11.40)

and thus

$K(\tilde{Y}) = K(Y') + K(Y'') + K(Y''')$. (11.41)

By considering initial and final projectors, Sec. 11.3, it is at once evident that
$K(Y'')$ and $K(Y''')$ are orthogonal to $K(Y)$. Consequently,

$\langle K(Y), K(\tilde{Y})\rangle = \langle K(Y), K(Y')\rangle$, (11.42)

so that Y is an inconsistent history if the right side of this equation is nonzero. 
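
The criterion (11.42) can be tried out on a simple case. The following sketch uses a hypothetical example chosen for illustration (it is not claimed to be the family of (10.35)): a spin-half particle with trivial dynamics and the three-time history $[z^+] \odot [x^+] \odot [z^+]$.

```python
import numpy as np

I2 = np.eye(2)
z_plus = np.array([[1.0, 0.0], [0.0, 0.0]])         # projector [z+]
x_plus = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])   # projector [x+]

def chain(a, b, c):
    # Trivial dynamics (T = I), so the chain operator of A, B, C is C B A.
    return c @ b @ a

def inner(p, q):
    # Operator inner product <P, Q> = Tr(P^dagger Q).
    return np.trace(p.conj().T @ q)

A, B, C = z_plus, x_plus, z_plus
k_y = chain(A, B, C)
k_yp = chain(A, I2 - B, C)        # Y'   of (11.38)
k_ypp = chain(A, I2, I2 - C)      # Y''  of (11.38)
k_yppp = chain(I2 - A, I2, I2)    # Y''' of (11.38)
k_ytilde = k_yp + k_ypp + k_yppp  # linearity, (11.41)

lhs = inner(k_y, k_ytilde)        # left side of (11.42)
rhs = inner(k_y, k_yp)            # right side of (11.42)
print(lhs, rhs)
```

Both sides equal 1/4, which is nonzero, so under these assumptions the history $[z^+] \odot [x^+] \odot [z^+]$ is intrinsically inconsistent.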

As an example, consider the histories in (10.35), and let $Y = Y^1$. Then $Y' = Y^3$,
and (10.37), which was used to show that (10.35) is an inconsistent family, also
shows that $Y^1$ is intrinsically inconsistent. The same is true of $Y^2$, $Y^3$, and $Y^4$. The
same basic strategy can be applied in certain cases which are at first sight more
complicated; e.g., the histories in (13.19).


12.1 Toy beam splitter

Beam splitters are employed in optics, in devices such as the Michelson and Mach- 
Zehnder interferometers, to split an incoming beam of light into two separate 
beams propagating perpendicularly to each other. The analogous situation in a 
neutron interferometer is achieved using a single crystal of silicon as a beam split- 
ter. The toy beam splitter in Fig. 12.1 can be thought of as a model of either an 
optical or a neutron beam splitter. It has two entrance channels (or ports) a and b, 
and two exit channels c and d. The sites are labeled by a pair mz, where m is an 
integer, and z is one of the four letters a, b, c, or d, indicating the channel in which 
the site is located. 

The unitary time development operator is $T = S_b$, where the action of the operator
$S_b$ is given by

$S_b|mz\rangle = |(m+1)z\rangle$, (12.1)

with the exceptions:

$S_b|0a\rangle = (+|1c\rangle + |1d\rangle)/\sqrt{2}$,
$S_b|0b\rangle = (-|1c\rangle + |1d\rangle)/\sqrt{2}$. (12.2)

The physical significance of the states $|0a\rangle$, $|1c\rangle$, etc., is not altered if they are
multiplied by arbitrary phase factors, see Sec. 2.2, and this means that (12.2) is not
the only possible way of representing the action of the beam splitter. One could
equally well replace the states on the right side with

$(i|1c\rangle + |1d\rangle)/\sqrt{2}$, $(|1c\rangle + i|1d\rangle)/\sqrt{2}$, (12.3)

or make other choices for the phases. There are two other exceptions to (12.1) that
are needed to supply the model with periodic boundary conditions which connect
the c channel back into the a channel and the d channel back into the b channel (or
c into b and d into a if one prefers). It is not necessary to write down a formula,
since we shall only be interested in short time intervals during which the particle
will not pass across the periodic boundaries and come back to the beam splitter.
That $S_b$ is unitary follows from the fact that it maps an orthonormal basis of the
Hilbert space, namely the collection of all kets of the form $|mz\rangle$, onto another
orthonormal basis of the same space; see Sec. 7.2.

Fig. 12.1. Toy beam splitter.
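
The definition of $S_b$ can be turned into an explicit matrix. The sketch below is one possible finite realization, not the only one: it assumes the a and b channels contain sites $m = -N, \ldots, 0$, the c and d channels contain sites $m = 1, \ldots, N$, and the periodic boundary sends the last c site to the first a site and the last d site to the first b site. The value of `N` and all names are illustrative.

```python
import numpy as np

N = 6  # assumed number of sites per arm; must exceed the number of time steps used
sites = [(m, z) for z in "ab" for m in range(-N, 1)] + \
        [(m, z) for z in "cd" for m in range(1, N + 1)]
index = {site: i for i, site in enumerate(sites)}
dim = len(sites)

def ket(m, z):
    # Basis ket |mz> as a column of the identity.
    v = np.zeros(dim, dtype=complex)
    v[index[(m, z)]] = 1.0
    return v

def image(m, z):
    # Action of S_b on the basis ket |mz>.
    if (m, z) == (0, "a"):
        return (ket(1, "c") + ket(1, "d")) / np.sqrt(2)     # (12.2)
    if (m, z) == (0, "b"):
        return (-ket(1, "c") + ket(1, "d")) / np.sqrt(2)    # (12.2)
    if m == N:
        # Assumed periodic boundary: c feeds back into a, d into b.
        return ket(-N, "a" if z == "c" else "b")
    return ket(m + 1, z)                                    # (12.1)

# Columns of S_b are the images of the basis kets.
S_b = np.column_stack([image(m, z) for (m, z) in sites])

is_unitary = np.allclose(S_b.conj().T @ S_b, np.eye(dim))

# Three time steps starting from |0a> give (|3c> + |3d>)/sqrt(2).
psi_3 = np.linalg.matrix_power(S_b, 3) @ ket(0, "a")
expected = (ket(3, "c") + ket(3, "d")) / np.sqrt(2)
print(is_unitary, np.allclose(psi_3, expected))
```

Because the images of the basis kets again form an orthonormal basis, the matrix is unitary for any choice of `N`, which mirrors the argument given in the text.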

Suppose that at $t = 0$ the particle starts off in the state

$|\psi_0\rangle = |0a\rangle$, (12.4)

that is, it is in the a channel and about to enter the beam splitter. Unitary time
development up to a time $t > 0$ results in

$|\psi_t\rangle = S_b^t|\psi_0\rangle = (|tc\rangle + |td\rangle)/\sqrt{2} = |t\bar{a}\rangle$, (12.5)

where

$|m\bar{a}\rangle = (|mc\rangle + |md\rangle)/\sqrt{2}$, $|m\bar{b}\rangle = (-|mc\rangle + |md\rangle)/\sqrt{2}$ (12.6)

are the states resulting from unitary time evolution when the particle starts off in
$|0a\rangle$ or $|0b\rangle$, respectively.

Let us consider histories involving just two times, with an initial state $|\psi_0\rangle = |0a\rangle$
at $t = 0$, and a basis at some time $t > 0$ consisting of the states $\{|mz\rangle\}$, $z = a$,
$b$, $c$, or $d$, corresponding to a decomposition of the identity

$I = \sum_{m,z} [mz]$. (12.7)

By treating $|\psi_t\rangle$ as a pre-probability, see Sec. 9.4, one finds that

$\Pr([mc]_t) = (1/2)\delta_{tm} = \Pr([md]_t)$, (12.8)

while all other probabilities vanish; that is, at time $t$ the particle will be either in the
c output channel at the site $tc$, or in the d channel at $td$. Here $[mc]$ is a projector
onto the ray which contains $|mc\rangle$, and the subscript indicates the time at which the
event occurs.

If, on the other hand, one employs a unitary history, Sec. 8.7, in which at time
$t$ the particle is in the state $|t\bar{a}\rangle$, one cannot say that it is in either the c or the d
channel. The situation is analogous to the case of a spin-half particle with an initial
state $|z^+\rangle$ and trivial dynamics, discussed in Sec. 9.3. In a unitary history with
$S_z = +1/2$ at a later time it is not meaningful to ascribe a value to $S_x$, whereas by
using a sample space in which $S_x$ at the later time makes sense, one concludes that
$S_x = +1/2$ or $S_x = -1/2$, each with probability 1/2.

The toy beam splitter is a bit more complicated than a spin-half particle, because 
when we say that “the particle is in the c channel”, we are not committed to saying 
that it is at a particular site in the c channel. Instead, being in the c channel or 
being in the d channel is represented by means of projectors 

$C = \sum_m |mc\rangle\langle mc| = \sum_m [mc]$, $D = \sum_m [md]$. (12.9)

Neither of these projectors commutes with a projector $[m\bar{a}]$ corresponding to the
state $|m\bar{a}\rangle$ defined in (12.6), so if we use a unitary history, we cannot say that the
particle is in channel c or channel d. Note that whenever it is sensible to speak of 
a particle being in channel c or channel d, it cannot possibly be in both channels, 
since 


CD = 0; (12.10) 

that is, these properties are mutually exclusive. A quantum particle can lack a 
definite location, as in the state $|m\bar{a}\rangle$, but, as already pointed out in Sec. 4.5, it
cannot be in two places at the same time. 

The fact that the particle is at the site $tc$ with probability 1/2 and at the site $td$
with probability 1/2 at a time $t > 0$, (12.8), might suggest that with probability
1/2 the particle is moving out the c channel through a succession of sites 1c, 2c,
3c, and so forth, and with probability 1/2 out the d channel through 1d, 2d, etc.
But this is not something one can infer by considering histories defined at only two
times, for it would be equally consistent to suppose that the particle hops from 2c
to 3d during the time step from $t = 2$ to $t = 3$, and from 2d to 3c if it happens
to be in the d channel at $t = 2$. In order to rule out unphysical possibilities of this
sort we need to consider histories involving more than just two times.

Consider a family of histories based upon the initial state $[0a]$ and at each time
$t > 0$ the decomposition of the identity (12.7), so that the particle has a definite
location. The histories are then of the form, for a set of times $t = 0, 1, 2, \ldots f$,

$Y = [0a] \odot [mz] \odot [m'z'] \odot \cdots [m''z'']$, (12.11)

with a chain operator of the form $K(Y) = |\phi\rangle\langle 0a|$, Sec. 11.6, where the chain ket
is

$|\phi\rangle = |m''z''\rangle \cdots \langle m'z'|S_b|mz\rangle\langle mz|S_b|0a\rangle$. (12.12)

From (12.2) it is obvious that the term $\langle mz|S_b|0a\rangle$ is 0 unless $m = 1$ and $z = c$
or $d$, and given $m = 1$, it follows from (12.1) that $\langle m'z'|S_b|mz\rangle$ vanishes unless
$m' = 2$ and $z' = z$. By continuing this argument one sees that $|\phi\rangle$, and therefore
$K(Y)$, will vanish for all but two histories, which in the case $f = 4$ are

$Y^c = [0a] \odot [1c] \odot [2c] \odot [3c] \odot [4c]$,
$Y^d = [0a] \odot [1d] \odot [2d] \odot [3d] \odot [4d]$. (12.13)

The fact that the final projectors $[4c]$ and $[4d]$ in (12.13) are orthogonal to each
other means that the chain operators $K(Y^c)$ and $K(Y^d)$ are orthogonal, in accordance
with a general principle noted in Sec. 11.3. Since the chain operators of all
the other histories are zero, it follows that $Y^c$ and $Y^d$ form the support, as defined in
Sec. 11.2, of a consistent family. It is straightforward to show, either by means of
chain kets as discussed in Sec. 11.6 or by a direct use of $W(Y) = \langle K(Y), K(Y)\rangle$,
that

$W(Y^c) = 1/2 = W(Y^d)$, (12.14)

and hence, assuming an initial state of $[0a]$ with probability 1, the two histories
$Y^c$ and $Y^d$ each have probability 1/2, while all other histories in this family have
probability 0.
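
The weights in (12.14) can be reproduced with chain kets. The sketch below rebuilds the toy beam splitter on a finite ring under the same illustrative assumptions as before (sites $m = -N, \ldots, 0$ in channels a and b, sites $m = 1, \ldots, N$ in channels c and d, and a periodic boundary from c to a and from d to b) and evaluates (12.12) for $f = 4$. The names `chain_ket`, `weight`, and `hop` are hypothetical.

```python
import numpy as np

N = 6  # assumed number of sites per arm, larger than the final time f = 4
sites = [(m, z) for z in "ab" for m in range(-N, 1)] + \
        [(m, z) for z in "cd" for m in range(1, N + 1)]
index = {site: i for i, site in enumerate(sites)}
dim = len(sites)

def ket(m, z):
    v = np.zeros(dim, dtype=complex)
    v[index[(m, z)]] = 1.0
    return v

def image(m, z):
    # Action of S_b on |mz>, following (12.1) and (12.2).
    if (m, z) == (0, "a"):
        return (ket(1, "c") + ket(1, "d")) / np.sqrt(2)
    if (m, z) == (0, "b"):
        return (-ket(1, "c") + ket(1, "d")) / np.sqrt(2)
    if m == N:
        return ket(-N, "a" if z == "c" else "b")  # assumed periodic boundary
    return ket(m + 1, z)

S_b = np.column_stack([image(m, z) for (m, z) in sites])

def projector(m, z):
    # Projector [mz] onto the site ket |mz>.
    return np.outer(ket(m, z), ket(m, z).conj())

def chain_ket(path):
    # (12.12): alternate one time step S_b with projection onto the next site.
    phi = ket(0, "a")
    for (m, z) in path:
        phi = projector(m, z) @ (S_b @ phi)
    return phi

def weight(path):
    # W = <phi|phi>, as in (11.30).
    phi = chain_ket(path)
    return np.vdot(phi, phi).real

Y_c = [(1, "c"), (2, "c"), (3, "c"), (4, "c")]
Y_d = [(1, "d"), (2, "d"), (3, "d"), (4, "d")]
hop = [(1, "c"), (2, "c"), (3, "d"), (4, "d")]  # a history that hops from c to d

overlap = np.vdot(chain_ket(Y_c), chain_ket(Y_d))
print(weight(Y_c), weight(Y_d), weight(hop), abs(overlap))
```

The two histories of (12.13) each receive weight 1/2 with mutually orthogonal chain kets, while the hopping history has weight zero.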

The fact that the only histories with finite probability are $Y^c$ and $Y^d$ means that
if the particle arrives at the site 1c at $t = 1$, it continues to move out along the c
channel, and does not hop to the d channel, and if the particle is at 1d at time $t = 1$,
it moves out along the d channel. Thus by using multiple-time histories one can
eliminate the possibility that the particle hops back and forth between channels c
and d, something which cannot be excluded by considering only two-time histories,
as noted earlier. A formal argument confirming what is rather obvious from looking
at (12.13) can be constructed by calculating the probability

$\Pr(D_t \mid [1c]_1) = \Pr(D_t \wedge [1c]_1)/\Pr([1c]_1)$ (12.15)

that the particle will be in the d channel at some time $t > 0$, given that it was at the
site $[1c]$ at $t = 1$. Here $D_t$ is a projector on the history space for the particle to be
in channel d at time $t$. For example, for $t = 2$,

$D_2 = I \odot I \odot D \odot I \odot I$, (12.16)

and thus

$D_2 \wedge [1c]_1 = I \odot [1c] \odot D \odot I \odot I$. (12.17)

This projector gives 0 when applied to either $Y^c$ or $Y^d$, the only two histories with
positive probability, and therefore the numerator on the right side of (12.15) is 0.
Thus if the particle is at 1c at $t = 1$, it will not be in the d channel at $t = 2$. The
same argument works equally well for other values of $t$, and analogous results are
obtained if the particle is initially in the d channel. Thus one has

$\Pr(D_t \mid [1c]_1) = 0 = \Pr(C_t \mid [1d]_1)$,
$\Pr(C_t \mid [1c]_1) = 1 = \Pr(D_t \mid [1d]_1)$ (12.18)

for any $t \geq 1$, where $C_t$ is defined in the same manner as $D_t$, with $C$ in place of $D$.

(Since we are considering a family which is based on the initial state $[0a]$, the
preceding discussion runs into the technical difficulty that $C_t$ and $D_t$ do not belong
to the corresponding Boolean algebra of histories when the latter is constructed in
the manner indicated in Sec. 8.5. One can get around this problem by replacing
$C_t$ and $D_t$ with the operators $C_t \wedge [0a]_0$ and $D_t \wedge [0a]_0$, and remembering that the
probabilities in (12.15) and (12.18) always contain the initial state $[0a]$ at $t = 0$ as
an (implicit) condition. Also see the remarks in Sec. 14.4.)

Another family of consistent histories can be constructed in the following way. 
At the times t = 1,2 use, in place of (12.7), a three-projector decomposition of the 
identity 

$$I = [t\bar a] + [t\bar b] + J_t, \tag{12.19}$$

where the states $|t\bar a\rangle$, $|t\bar b\rangle$ are defined in (12.6), and

$$J_t = I - [t\bar a] - [t\bar b] = I - [tc] - [td] \tag{12.20}$$

is a projector for the particle to be someplace other than the two sites $tc$ or $td$. At
later times $t \ge 3$ use the decomposition (12.7). It is easy to show that in the case
$f = 4$ the two histories

$$\begin{aligned}
\bar Y^c &= [0a] \odot [1\bar a] \odot [2\bar a] \odot [3c] \odot [4c], \\
\bar Y^d &= [0a] \odot [1\bar a] \odot [2\bar a] \odot [3d] \odot [4d],
\end{aligned} \tag{12.21}$$

each with weight 1/2, form the support of the sample space of a consistent family;
all other histories have zero weight.

The histories $\bar Y^c$ and $\bar Y^d$ in (12.21) have the physical significance that at $t = 1$
and $t = 2$ the particle is in a coherent superposition of states in both output
channels. After $t = 2$ a “split” occurs, and at later times the two histories are no
longer identical: one represents the particle as traveling out the c channel, and the
other the particle traveling out the d channel. What causes this split? To think of
a physical cause for it is to look at the problem in the wrong way. Recall the case
of a spin-half particle with trivial dynamics, discussed in Sec. 9.3, with $S_z = +1/2$
initially and then $S_x = \pm 1/2$ at a later time. There is no physical transformation of
the particle, since the dynamics is trivial. Instead, different aspects of the particle’s
spin angular momentum are being described at two successive times. In the same
way, the histories in (12.21) allow us to describe a property at times $t = 1$ and
$t = 2$, corresponding to the linear superposition $|m\bar a\rangle$, which cannot be described
if we use the histories in (12.13). Conversely, using (12.21) makes it impossible
to discuss whether the particle is in the c or in the d channel when $t = 1$ or 2,
because these properties are incompatible with the projectors employed in $\bar Y^c$ and
$\bar Y^d$. There is a similar split in the case of the histories $Y^c$ and $Y^d$: they start with
the same initial state [0a], and the split occurs when $t$ changes from 0 to 1. In this
situation one may be tempted to suppose that the beam splitter causes the split, but
that surely cannot be the case, for the very same beam splitter does not cause a split
in the case of $\bar Y^c$ and $\bar Y^d$.

We have one family of histories based upon $Y^c$ and $Y^d$, and a distinct family
based upon $\bar Y^c$ and $\bar Y^d$. The two families are incompatible, as they have no common
refinement. Which one provides the correct description of the physical system? 
Consider two histories of Great Britain: one a political history which discusses the 
monarchs, the other an intellectual history focusing upon developments in British 
science. Which is the correct history of Great Britain? That is not the proper 
way to compare them. Instead, there are certain questions which can be answered 
by one history rather than the other. For certain purposes one history is more 
useful, for other purposes the other is to be preferred. In the same way, both the 
$Y^c$, $Y^d$ family and the $\bar Y^c$, $\bar Y^d$ family provide correct (stochastic) descriptions of
the physical system, descriptions which are useful for answering different sorts
of questions. There are, to be sure, certain questions which can be answered using
either family, such as “Will the particle be in the c or the d channel at $t = 4$ if it was
at 3c at $t = 3$?” For such questions, both families give precisely the same answer,
in agreement with a general principle of consistency discussed in Sec. 16.3. 

Next consider a family in which the histories start off like $\bar Y^c$ and $\bar Y^d$ in (12.21),
but later on revert back to the coherent superposition states corresponding to
(12.19); for example

$$\begin{aligned}
Y' &= [0a] \odot [1\bar a] \odot [2\bar a] \odot [3c] \odot [4\bar a], \\
Y'' &= [0a] \odot [1\bar a] \odot [2\bar a] \odot [3d] \odot [4\bar a],
\end{aligned} \tag{12.22}$$

plus other histories needed to make up a sample space. This family is not consistent.
The reason is that the chain kets $|y'\rangle$ and $|y''\rangle$ corresponding to $K(Y')$ and
$K(Y'')$ are nonzero multiples of $|4\bar a\rangle$, so $\langle y'|y''\rangle \neq 0$, and hence $K(Y')$ and $K(Y'')$
are not orthogonal to each other, see (11.28). There is a certain analogy between
(12.22) and the inconsistent family for a spin-half particle involving three times
discussed in Sec. 10.3. The precise time at which the split and the rejoining occur
is not important; for example, the chain operators associated with the histories

$$\begin{aligned}
X' &= [0a] \odot [1c] \odot [2c] \odot [3c] \odot [4\bar a], \\
X'' &= [0a] \odot [1d] \odot [2d] \odot [3d] \odot [4\bar a]
\end{aligned} \tag{12.23}$$

are also not mutually orthogonal, so the corresponding family is inconsistent. In- 
consistency does not require a perfect rejoining; even a partial one can cause 
trouble! But why might someone want to consider families of histories of the form 
(12.22) or (12.23)? We will see in Ch. 13 that in the case of a simple interferometer 
the analogous histories look rather “natural”, and it will be of some importance that 
they are not part of a consistent family. 
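
The consistency statements of this section can be checked numerically. The following is a minimal sketch, not taken from the text, assuming the dynamics (12.1) and (12.2) with periodic boundaries ignored. Kets are stored as dictionaries, and the helper names (`step`, `project`, `chain_ket`, `abar`) are illustrative choices.

```python
from math import sqrt

R = 1 / sqrt(2)


def step(ket):
    """Apply the toy beam-splitter map S_b of (12.1)-(12.2).

    A ket is a dict {(m, z): amplitude}. Periodic boundaries are ignored,
    which is harmless for the few time steps used here (an assumption).
    """
    out = {}
    for (m, z), amp in ket.items():
        if (m, z) == (0, "a"):
            images = {(1, "c"): R, (1, "d"): R}
        elif (m, z) == (0, "b"):
            images = {(1, "c"): -R, (1, "d"): R}
        else:
            images = {(m + 1, z): 1.0}
        for label, coeff in images.items():
            out[label] = out.get(label, 0.0) + amp * coeff
    return out


def inner(u, v):
    """Inner product <u|v> of two kets."""
    return sum(complex(a).conjugate() * v.get(k, 0.0) for k, a in u.items())


def project(vec, ket):
    """Apply the projector [vec] = |vec><vec| for a normalized vec."""
    c = inner(vec, ket)
    return {k: c * a for k, a in vec.items()}


def site(m, z):
    return {(m, z): 1.0}


def abar(m):
    """Superposition (|mc> + |md>)/sqrt(2) of (12.6)."""
    return {(m, "c"): R, (m, "d"): R}


def chain_ket(history):
    """Chain ket of a history given as normalized kets for t = 0, 1, ..."""
    ket = dict(history[0])
    for vec in history[1:]:
        ket = project(vec, step(ket))
    return ket


def weight(history):
    k = chain_ket(history)
    return inner(k, k).real


start = site(0, "a")
Y_c = [start] + [site(t, "c") for t in range(1, 5)]                # (12.13)
Y_d = [start] + [site(t, "d") for t in range(1, 5)]
Ybar_c = [start, abar(1), abar(2), site(3, "c"), site(4, "c")]     # (12.21)
Ybar_d = [start, abar(1), abar(2), site(3, "d"), site(4, "d")]
Y_p = [start, abar(1), abar(2), site(3, "c"), abar(4)]             # (12.22)
Y_pp = [start, abar(1), abar(2), site(3, "d"), abar(4)]

print(weight(Y_c), weight(Y_d))        # both close to 1/2
print(weight(Ybar_c), weight(Ybar_d))  # both close to 1/2
overlap = inner(chain_ket(Y_p), chain_ket(Y_pp))
print(abs(overlap))                    # close to 1/4: (12.22) is inconsistent
```

In this sketch both chain kets of (12.22) come out as one half of the superposition ket at $t = 4$, which is why their inner product does not vanish, whereas the chain kets of (12.13) and of (12.21) have disjoint supports and are orthogonal.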


12.2 Beam splitter with detector 

Let us now add a detector of the sort described in Sec. 7.4 to the c output channel
of the beam splitter, Fig. 12.2. The detector has two states: $|0\hat c\rangle$ “ready”, and $|1\hat c\rangle$
“triggered”, which span a Hilbert space $\mathcal{C}$. The Hilbert space of the total quantum
system is

$$\mathcal{H} = \mathcal{M} \otimes \mathcal{C}, \tag{12.24}$$

where $\mathcal{M}$ is the Hilbert space of the particle passing through the beam splitter, and
the collection $\{|mz, n\hat c\rangle\}$ for different values of $m$, $z$, and $n$ is an orthonormal basis
of $\mathcal{H}$.

The unitary time development operator takes the form

$$T = S_b R_c, \tag{12.25}$$

where $S_b$ is the unitary transformation defined in (12.1) and (12.2), extended in the
usual way to the operator $S_b \otimes I$ on $\mathcal{M} \otimes \mathcal{C}$, and $R_c$ (the subscript indicates that
this detector is attached to the c channel) is defined in analogy with (7.53) as

$$R_c |mz, n\hat c\rangle = |mz, n\hat c\rangle, \tag{12.26}$$

with the exception that

$$R_c |2c, n\hat c\rangle = |2c, (1 - n)\hat c\rangle. \tag{12.27}$$

Fig. 12.2. Toy beam splitter with detector.

That is, $R_c$ is the identity operator unless the particle is at the site 2c, in which case
the detector flips from $0\hat c$ to $1\hat c$, or $1\hat c$ to $0\hat c$. As noted in Sec. 7.4, such a detector
does not perturb the motion of the particle, in the sense that the particle moves from
1c to 2c to 3c, etc. at successive time steps whether or not the detector is present.
We shall assume an initial state


$$|\Psi_0\rangle = |0a, 0\hat c\rangle \tag{12.28}$$

at $t = 0$: the particle is at 0a, and is about to enter the beam splitter, and the
detector is ready. Unitary time development of this initial state leads to

$$|\Psi_t\rangle = T^t |\Psi_0\rangle =
\begin{cases}
(|tc\rangle + |td\rangle) \otimes |0\hat c\rangle/\sqrt{2} & \text{for } t = 1, 2, \\
(|tc, 1\hat c\rangle + |td, 0\hat c\rangle)/\sqrt{2} & \text{for } t \ge 3.
\end{cases} \tag{12.29}$$

If one regards $|\Psi_t\rangle$ for $t \ge 3$ as representing a physical state or physical property
of the combined particle and detector, then the detector is not in a definite state.
Instead one has a toy counterpart of a macroscopic quantum superposition (MQS)
or Schrödinger’s cat state. See the discussion in Sec. 9.6. It is impossible to say
whether or not the detector has detected something at times $t \ge 3$ if one uses a
unitary family based upon the initial state $|\Psi_0\rangle$.

A useful family of histories for studying the process of detection is based on the
initial state $|\Psi_0\rangle$ and a decomposition of the identity in pure states

$$I = \sum_{m,z,n} [mz, n\hat c], \tag{12.30}$$

in which the particle has a definite location and the detector is in one of its pointer
states at every time $t > 0$. The histories

$$\begin{aligned}
Z^c &= [0a, 0\hat c] \odot [1c, 0\hat c] \odot [2c, 0\hat c] \odot [3c, 1\hat c] \odot [4c, 1\hat c] \odot \cdots, \\
Z^d &= [0a, 0\hat c] \odot [1d, 0\hat c] \odot [2d, 0\hat c] \odot [3d, 0\hat c] \odot [4d, 0\hat c] \odot \cdots,
\end{aligned} \tag{12.31}$$

continuing for as long a sequence of times as one wants to consider, are the obvious
counterparts of $Y^c$ and $Y^d$ in (12.13). Because the final projectors are orthogonal,
$K(Z^c)$ and $K(Z^d)$ are orthogonal, and it is not hard to show that $Z^c$ and
$Z^d$ constitute the support of a consistent family $\mathcal{F}$ based on the initial state $|\Psi_0\rangle$.
The physical interpretation of these histories is straightforward. In $Z^c$ the particle
moves out the c channel and triggers the detector, changing $0\hat c$ to $1\hat c$ as it moves
from 2c to 3c. In $Z^d$ the particle moves out the d channel, and the detector remains
in its untriggered or ready state $0\hat c$.

We can use the property that the detector has (or has not) detected the particle 
at some time $t' \ge 3$ to determine which channel the particle is in, by computing a
conditional probability. Thus one finds (see the discussion following (12.15))
that

$$\begin{aligned}
\Pr(C_t \mid [1\hat c]_{t'}) &= 1, & \Pr(D_t \mid [1\hat c]_{t'}) &= 0, \\
\Pr(C_t \mid [0\hat c]_{t'}) &= 0, & \Pr(D_t \mid [0\hat c]_{t'}) &= 1,
\end{aligned} \tag{12.32}$$

for $t' \ge 3$ and $t \ge 1$. That is, if at some time $t' \ge 3$, the detector has detected the
particle, then at time $t$, the particle is (or was) in the c and not in the d channel,
while if the detector has not detected the particle, the particle is (or was) in the d
and not in the c channel.

Note that the conditional probabilities in (12.32) are valid not simply for $t \ge 3$;
they also hold for $t = 1$ and 2. That is, if the detector is triggered at time $t' = 3$,
then the particle was in the c channel at $t = 1$ and 2, and if the detector is not
triggered at $t' = 3$, then at these earlier times the particle was in the d channel.
These results are perfectly reasonable from a physical point of view. How could 
the particle have triggered the detector unless it was already moving out along the 
c channel? And if it did not trigger the detector, where could it have been except in 
the d channel? As long as the particle does not hop from one channel to the other 
in some magical way, the results in (12.32) are just what one would expect. 
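
The conditional probabilities (12.32) can be reproduced by enumerating the histories of the pure-state family directly. The following is a minimal sketch under stated assumptions: basis states are written as tuples `(m, z, n)` with `n` the detector bit, periodic boundaries are ignored, and the function names are illustrative rather than anything defined in the text.

```python
from math import sqrt

R = 1 / sqrt(2)


def transition(state):
    """One step of T = S_b R_c, (12.25), on a basis state (m, z, n).

    R_c acts first and flips the detector bit n when the particle is at 2c,
    see (12.27); S_b then moves the particle, see (12.1)-(12.2).
    """
    m, z, n = state
    if (m, z) == (2, "c"):
        n = 1 - n
    if (m, z) == (0, "a"):
        return {(1, "c", n): R, (1, "d", n): R}
    if (m, z) == (0, "b"):
        return {(1, "c", n): -R, (1, "d", n): R}
    return {(m + 1, z, n): 1.0}


def histories(initial, steps):
    """Return (path, weight) for every basis-state history of nonzero amplitude."""
    frontier = [([initial], 1.0)]
    for _ in range(steps):
        frontier = [
            (path + [new], amp * coeff)
            for path, amp in frontier
            for new, coeff in transition(path[-1]).items()
        ]
    # Distinct paths end in distinct basis states here, so the chain kets are
    # mutually orthogonal and |amplitude|^2 is the weight of each history.
    finals = [path[-1] for path, _ in frontier]
    assert len(finals) == len(set(finals))
    return [(path, abs(amp) ** 2) for path, amp in frontier]


def conditional(family, event, condition):
    """Pr(event | condition), both given as predicates on a path."""
    denom = sum(w for path, w in family if condition(path))
    numer = sum(w for path, w in family if condition(path) and event(path))
    return numer / denom


def in_channel(z, t):
    return lambda path: path[t][1] == z


def detector(n, t):
    return lambda path: path[t][2] == n


family = histories((0, "a", 0), steps=5)
for t in (1, 2, 3, 4):
    print(t,
          conditional(family, in_channel("c", t), detector(1, 4)),
          conditional(family, in_channel("d", t), detector(0, 4)))
```

Only two paths survive the enumeration, the counterparts of $Z^c$ and $Z^d$, so conditioning on the detector bit at a time $t' \ge 3$ selects a single history and the channel at every earlier time $t \ge 1$ is then fixed.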

Another family in which the detector is always in one of its pointer states is the 
counterpart of (12.21), modified by the addition of a detector: 


$$\begin{aligned}
\bar Z^c &= [0a, 0\hat c] \odot [1\bar a, 0\hat c] \odot [2\bar a, 0\hat c] \odot [3c, 1\hat c] \odot [4c, 1\hat c] \odot \cdots, \\
\bar Z^d &= [0a, 0\hat c] \odot [1\bar a, 0\hat c] \odot [2\bar a, 0\hat c] \odot [3d, 0\hat c] \odot [4d, 0\hat c] \odot \cdots.
\end{aligned} \tag{12.33}$$

The chain operators for $\bar Z^c$ and $\bar Z^d$ are orthogonal, and it is easy to find zero-weight
histories to complete the sample space, so that (12.33) is the support of a consistent
family $\mathcal{G}$. It differs from $\mathcal{F}$, (12.31), in that at $t = 1$ and 2 the particle is in the
superposition state $|t\bar a\rangle$ rather than in the c or the d channel, but for times after
$t = 2$, $\mathcal{F}$ and $\mathcal{G}$ are identical.

Both families $\mathcal{F}$, (12.31), and $\mathcal{G}$, (12.33), represent equally good quantum
descriptions. The only difference is that they allow one to discuss somewhat different
properties of the particle at a time after it has passed through the beam splitter and
before it has been detected. In particular, if one is interested in knowing the location
of the particle before the measurement occurred (or could have occurred), it is
necessary to employ a consistent family in which questions about its location are
meaningful, so $\mathcal{F}$ must be used, not $\mathcal{G}$. On the other hand, if one is interested in
whether the particle was in the superposition $|1\bar a\rangle$ at $t = 1$ rather than in $|1\bar b\rangle$
(see the definitions in (12.6)), then it is necessary to use $\mathcal{G}$, for questions related
to such superpositions are meaningless in $\mathcal{F}$.

The family $\mathcal{G}$, (12.33), is useful for understanding the idea, which goes back
to von Neumann, that a measurement produces a “collapse” or “reduction” of the
wave function. As applied to our toy model, a measurement which serves to detect
the presence of the particle in the c channel is thought of as collapsing the
superposition wave function $|2\bar a\rangle$ produced by unitary time evolution into a state $|3c\rangle$
located in the c channel. This is the step from $[2\bar a, 0\hat c]$ to $[3c, 1\hat c]$ in the history $\bar Z^c$.
Similarly, if the detector does not detect the particle, $|2\bar a\rangle$ collapses to a state $|3d\rangle$
in the d channel, as represented by the step from $t = 2$ to $t = 3$ in the history $\bar Z^d$.

The approach to measurements based on wave function collapse is the subject 
of Sec. 18.2. While it can often be employed in a way which gives correct results, 
wave function collapse is not really needed, since the same results can always be 
obtained by straightforward use of conditional probabilities. On the other hand, 
it has given rise to a lot of confusion, principally because the collapse tends to 
be thought of as a physical effect produced by the measuring apparatus. With 
reference to our toy model, this might be a reasonable point of view when the 
particle is detected to be in the c channel, but it seems very odd that a failure 
to detect the particle in the c channel has the effect of collapsing its wave func- 
tion into the d channel, which might be a long way away from the c detector. 
That the collapse is not any sort of physical effect is clear from the fact that it 
occurs in the family (12.21) in the absence of a detector, and in $\mathcal{F}$, (12.31), it
occurs prior to detection. To be sure, in $\mathcal{F}$ one might suppose that the collapse
is caused by the beam splitter. However, one could modify (12.31) in an obvious
way to produce a consistent family in which the collapse takes place between
$t = 1$ and $t = 2$, and thus has nothing to do with either the beam splitter or
detector. 

Another way in which the collapse approach to quantum measurements is some- 
what unsatisfactory is that it does not provide a connection between the outcome 
of a measurement and a corresponding property of the measured system before the 
measurement took place. For example, if at $t \ge 3$ the detector is in the state $1\hat c$,
there is no way to infer that the particle was earlier in the c channel if one uses 
the family (12.33) rather than (12.31). The connection between measurements and 
what they measure will be discussed in Ch. 17. 


12.3 Time-elapse detector 

A simple two-state toy detector is useful for thinking about a number of situations
in quantum theory involving detection and measurement. However, it has its
limitations. In particular, unlike real detectors, it does not have sufficient complexity
to allow the time at which an event occurs to be recorded by the detector. While it
is certainly possible to include a clock as part of a toy detector, a slightly simpler
solution to the timing problem is to use a time-elapse detector: when an event is
detected, a clock is started, and reading this clock tells how much time has elapsed
since the detection occurred. As in Sec. 7.4, the Hilbert space $\mathcal{H}$ is a tensor product
$\mathcal{M} \otimes \mathcal{N}$ of the space $\mathcal{M}$ of the particle, spanned by kets $|m\rangle$ with $-M_a \le m \le M_b$,
and the space $\mathcal{N}$ of the detector, with kets $|n\rangle$ labeled by $n$ in the range

$$-N \le n \le N. \tag{12.34}$$


In effect, one can think of the detector as a second particle that moves according 
to an appropriate dynamics. However, to avoid confusion the term particle will 
be reserved for the toy particle whose position is labeled by m, and which the 
detector is designed to detect, while n will be the position of the detector’s pointer 
(see the remarks at the end of Sec. 9.5). We shall suppose that $M_a$, $M_b$, and $N$
are sufficiently large that we do not have to worry about either the particle or the 
pointer “coming around the cycle” during the time period of interest. 

The unitary time development operator is 

$$T = S R S_d, \tag{12.35}$$

where $S$ is the shift operator on $\mathcal{M}$,

$$S|m\rangle = |m + 1\rangle, \tag{12.36}$$

with a periodic boundary condition $S|M_b\rangle = |-M_a\rangle$, and $S_d$ acts on $\mathcal{N}$,

$$S_d|n\rangle = |n + 1\rangle, \tag{12.37}$$

with the exceptions

$$S_d|0\rangle = |0\rangle, \quad S_d|-1\rangle = |1\rangle, \tag{12.38}$$

and $S_d|N\rangle = |-N\rangle$ to take care of the periodic boundary condition. The unitary
operator $R$ which couples the pointer to the particle is the identity,

$$R|m, n\rangle = |m, n\rangle, \tag{12.39}$$

except for

$$R|2, 0\rangle = |2, 1\rangle, \quad R|2, 1\rangle = |2, 0\rangle. \tag{12.40}$$

That is, when the particle is at $m = 2$, $R$ moves the pointer from $n = 0$ to $n = 1$,
or from $n = 1$ to $n = 0$, while if the pointer is someplace else, $R$ has no effect on
it. The unitarity of $T$ in (12.35) follows from that of $S$, $R$, and $S_d$.

When its pointer is at $n = 0$, the detector is in its “ready” state, where it remains
until the particle reaches $m = 2$, at which point the “detection event” (12.40)
occurs, and the pointer hops to $n = 1$ at the same time as the particle hops to $m = 3$,
since $T$ includes the shift operator $S$ for the particle, (12.35). This is identical to
the operation of the two-state detector of Sec. 7.4. But once the detector pointer is
at $n = 1$ it keeps going, (12.37), so a typical unitary time development of $|m, n\rangle$ is
of the form

$$|0, 0\rangle \mapsto |1, 0\rangle \mapsto |2, 0\rangle \mapsto |3, 1\rangle \mapsto |4, 2\rangle \mapsto |5, 3\rangle \mapsto \cdots. \tag{12.41}$$

Thus the pointer reading n (assumed to be less than N ) tells how much time has 
elapsed since the detection event occurred. 
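
Because $S$, $R$, and $S_d$ each map basis states to basis states, the dynamics (12.35) can be traced with a few lines of code. This is a minimal sketch; the lattice sizes `M_a`, `M_b`, `N` are illustrative values that do not come from the text, and the function names are hypothetical.

```python
def step(m, n, M_a=20, M_b=20, N=20):
    """One step of T = S R S_d, (12.35), on a basis state |m, n>.

    Operators act right to left: S_d first, then R, then S.
    The default lattice sizes are illustrative assumptions.
    """
    # S_d: pointer dynamics (12.37)-(12.38) with periodic boundary.
    if n == 0:
        pass                # ready state stays put
    elif n == -1:
        n = 1               # the pointer skips over n = 0
    elif n == N:
        n = -N
    else:
        n += 1
    # R: coupling (12.39)-(12.40), active only when the particle is at m = 2.
    if m == 2 and n in (0, 1):
        n = 1 - n
    # S: particle shift (12.36) with periodic boundary.
    m = -M_a if m == M_b else m + 1
    return m, n


def trajectory(m, n, steps):
    """Successive basis states starting from |m, n>."""
    states = [(m, n)]
    for _ in range(steps):
        states.append(step(*states[-1]))
    return states


print(trajectory(0, 0, 5))
# [(0, 0), (1, 0), (2, 0), (3, 1), (4, 2), (5, 3)]
```

The printed sequence is (12.41). Since `step` is a composition of three permutations of the basis states, it is itself a permutation, which is the statement that $T$ is unitary.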

As an example of the operation of this detector in a stochastic context, suppose 
that at t = 0 there is an initial state 


$$|\Psi_0\rangle = |\psi_0\rangle \otimes |0\rangle, \tag{12.42}$$

where the particle wave packet

$$|\psi_0\rangle = a|0\rangle + b|1\rangle + c|2\rangle \tag{12.43}$$

has three nonzero coefficients $a$, $b$, $c$. Consider histories which for $t > 0$ employ
a decomposition of the identity corresponding to the orthonormal basis $\{|m, n\rangle\}$.
The chain operators for the three histories

$$\begin{aligned}
Z^0 &= [\Psi_0] \odot [1, 0] \odot [2, 0] \odot [3, 1], \\
Z^1 &= [\Psi_0] \odot [2, 0] \odot [3, 1] \odot [4, 2], \\
Z^2 &= [\Psi_0] \odot [3, 1] \odot [4, 2] \odot [5, 3],
\end{aligned} \tag{12.44}$$

involving the four times $t = 0, 1, 2, 3$, are obviously orthogonal to one another
(because of the final projectors, Sec. 11.3). The corresponding weights are $|a|^2$,
$|b|^2$, and $|c|^2$, while all other histories beginning with $[\Psi_0]$ have zero weight. Hence
(12.44) is the support of a consistent family with initial state $|\Psi_0\rangle$.

Suppose that the pointer is located at $n = 2$ when $t = 3$. Since the pointer
position indicates the time that has elapsed since the particle was detected, we
should be able to infer that the detection event [2, 0] occurred at $t = 3 - 2 = 1$.
Indeed, one can show that

$$\Pr([2, 0] \text{ at } t = 1 \mid n = 2 \text{ at } t = 3) = 1, \tag{12.45}$$

using the fact that the condition $n = 2$ when $t = 3$ is only true for $Z^1$. If the
pointer is at $n = 1$ when $t = 3$, one can use the family (12.44) to show not
only that the detection event [2, 0] occurred at $t = 2$, but also that at $t = 1$ the
particle was at $m = 1$, one site to the left of the detector. Being able to infer
where the particle was before it was detected is intuitively reasonable, and is the 
sort of inference often employed when analyzing data from real detectors in the 
laboratory. Such inferences depend, of course, on using an appropriate consistent 
family, as discussed in Sec. 12.2. 
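
The inference (12.45) can be illustrated with a short calculation. This is a sketch under stated assumptions: the coefficients chosen for (12.43) are arbitrary normalized values, periodic boundaries are ignored, and the names `step`, `family`, and `prob` are illustrative.

```python
def step(m, n):
    """T = S R S_d of (12.35) on |m, n>, ignoring periodic boundaries."""
    if n == -1:             # S_d, exceptions (12.38)
        n = 1
    elif n != 0:            # S_d, generic case (12.37)
        n += 1
    if m == 2 and n in (0, 1):   # R, (12.40)
        n = 1 - n
    return m + 1, n         # S, (12.36)


# Illustrative normalized coefficients a, b, c for (12.43); any nonzero
# values with |a|^2 + |b|^2 + |c|^2 = 1 would do.
coeffs = {0: 0.6, 1: 0.64j, 2: 0.48}

# Each component |m, 0> evolves deterministically, so it generates exactly
# one history in the basis {|m, n>}, with weight |coefficient|^2.
family = []
for m0, amp in coeffs.items():
    state = (m0, 0)
    events = [None]         # t = 0 carries the initial projector [Psi_0]
    for _ in range(3):
        state = step(*state)
        events.append(state)
    family.append((events, abs(amp) ** 2))


def prob(event, condition):
    """Pr(event | condition) within the family (12.44)."""
    den = sum(w for e, w in family if condition(e))
    num = sum(w for e, w in family if condition(e) and event(e))
    return num / den


for events, w in family:
    print(events[1:], round(w, 4))

# (12.45): detection event [2, 0] at t = 1, given n = 2 at t = 3.
print(prob(lambda e: e[1] == (2, 0), lambda e: e[3][1] == 2))
# Particle at m = 1 at t = 1, given n = 1 at t = 3.
print(prob(lambda e: e[1][0] == 1, lambda e: e[3][1] == 1))
```

Each pointer reading at $t = 3$ picks out exactly one of the three histories, so both conditional probabilities equal 1 regardless of the particular coefficients chosen.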


12.4 Toy alpha decay 

A toy model of alpha decay was introduced in Sec. 7.4, and discussed using the
Born rule in Sec. 9.5. We assume the sites are labeled as in Fig. 7.2 on page 106,
and will employ the same $T = S_a$ dynamics used previously, (7.56). That is,

$$S_a|m\rangle = |m + 1\rangle, \tag{12.46}$$

with the exceptions

$$S_a|0\rangle = \alpha|0\rangle + \beta|1\rangle, \quad S_a|-1\rangle = \gamma|0\rangle + \delta|1\rangle, \tag{12.47}$$

together with a periodic boundary condition. The coefficients $\alpha$, $\beta$, $\gamma$, and $\delta$ satisfy
(7.58).

Consider histories which begin with the initial state 

$$|\psi_0\rangle = |0\rangle, \tag{12.48}$$

the alpha particle inside the nucleus, and employ a decomposition of the identity
based upon particle position states $|m\rangle$ at all later times. That such a family of
histories, thought of as extending from the initial state at $t = 0$ till a later time
$t = f$, is consistent can be seen by working out what happens when $f$ is small. In
particular, when $f = 1$, there are two histories with nonzero weight:

$$\begin{aligned}
&[0] \odot [0], \\
&[0] \odot [1].
\end{aligned} \tag{12.49}$$

The chain operators are orthogonal because the projectors at the final time are
mutually orthogonal (Sec. 11.3). With $f = 2$, there are three histories with nonzero
weight:

$$\begin{aligned}
&[0] \odot [0] \odot [0], \\
&[0] \odot [0] \odot [1], \\
&[0] \odot [1] \odot [2],
\end{aligned} \tag{12.50}$$

and again it is obvious that the chain operators are orthogonal, so that the
corresponding family is consistent.

These examples suggest the general pattern, valid for any $f$. The support of the
consistent family contains a history in which $m = 0$ at all times, together with
histories with a decay time $t = \tau$, with $\tau$ in the range $0 \le \tau \le f - 1$, of the form

$$[0]_0 \odot [0]_1 \odot \cdots [0]_\tau \odot [1]_{\tau+1} \odot [2]_{\tau+2} \odot \cdots. \tag{12.51}$$

That is, the alpha particle remains in the nucleus, $m = 0$, until the time $t = \tau$,
then hops to $m = 1$ at $t = \tau + 1$, and after that it keeps going. If one uses this
particular family of histories, the quantum problem is much the same as that of a 
classical particle which hops out of a well with a certain probability at each time 
step, and once out of the well moves away from it at a constant speed. This is not 
surprising, since as long as one employs a single consistent family the mathematics 
of a quantum stochastic process is formally identical to that of a classical stochastic 
process. 
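
The weights of the histories (12.51) can be generated by treating each time step as a branching with amplitudes taken from (12.47). The following is a minimal sketch: the real rotation-like coefficients are an illustrative choice satisfying the unitarity conditions (normalization and orthogonality of the two exceptional images), which is what (7.58) is assumed to express here, and periodic boundaries are ignored.

```python
from math import cos, sin

theta = 0.3  # illustrative value; sets the decay amplitude beta = sin(theta)
alpha, beta = cos(theta), sin(theta)
gamma, delta = -sin(theta), cos(theta)


def transition(m):
    """Amplitudes for S_a of (12.46)-(12.47), periodic boundary ignored."""
    if m == 0:
        return {0: alpha, 1: beta}
    if m == -1:
        return {0: gamma, 1: delta}
    return {m + 1: 1.0}


def histories(f):
    """Position histories for t = 0, ..., f starting at m = 0, with weights."""
    frontier = [([0], 1.0)]
    for _ in range(f):
        frontier = [
            (path + [m], amp * coeff)
            for path, amp in frontier
            for m, coeff in transition(path[-1]).items()
        ]
    # Distinct paths end at distinct sites, so the chain kets are orthogonal
    # and the weight of each history is |amplitude|^2.
    return [(path, abs(amp) ** 2) for path, amp in frontier]


f = 6
for path, w in histories(f):
    tau = max(t for t, m in enumerate(path) if m == 0)  # last time in nucleus
    label = "no decay" if tau == f else f"tau = {tau}"
    print(path, label, round(w, 6))
```

In this sketch a history with decay time $\tau$ has weight $|\alpha|^{2\tau}|\beta|^2$ and the undecayed history has weight $|\alpha|^{2f}$, the same geometric pattern as for a classical particle that escapes a well with probability $|\beta|^2$ at each step.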

In Sec. 9.5 a simple two-state detector was used in analyzing toy alpha decay
by means of the Born rule. Additional insight can be gained by replacing the
two-state detector in Fig. 9.1 with the time-elapse detector of Sec. 12.3 to detect the
alpha particle as it hops from $m = 2$ to $m = 3$ after leaving the nucleus. On the
Hilbert space $\mathcal{M} \otimes \mathcal{N}$ of the alpha particle and detector pointer, the unitary time
development operator is

$$T = S_a R S_d, \tag{12.52}$$

where $S_d$ and $R$ are defined in (12.37)-(12.40).

Suppose that at the time $t = \bar t$ the detector pointer is at $\bar n$. Then the detection
event should have occurred at the time $\bar t - \bar n$. And since the particle was detected
at the site $m = 2$, the actual decay time $\tau$ when it left the nucleus would have been
a bit earlier,

$$\tau = \bar t - \bar n - 2, \tag{12.53}$$

because of the finite travel time from the nucleus to the detector. This line of
reasoning can be confirmed by a straightforward calculation of the conditional
probabilities

$$\begin{aligned}
\Pr(m = 0 \text{ at } t = \bar t - \bar n - 2 \mid n = \bar n \text{ at } t = \bar t) &= 1, \\
\Pr(m = 0 \text{ at } t = \bar t - \bar n - 1 \mid n = \bar n \text{ at } t = \bar t) &= 0.
\end{aligned} \tag{12.54}$$

That is, at the time $\tau$ given in (12.53), the particle was still in the nucleus, while
one time step later it was no longer there. (Of course this only makes sense if $\bar t$ and
$\bar n$ are such that $\Pr(n = \bar n \text{ at } t = \bar t)$ is positive.) Note once again that by adopting
an appropriate family of histories one can make physically reasonable inferences 
about events prior to the detection of the alpha particle. 
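
The conditional probabilities (12.54) can be checked by enumerating basis-state histories of the combined particle and pointer. This is a minimal sketch under stated assumptions: the values of the two amplitudes are illustrative, only branches reachable from the initial state with the particle in the nucleus and the pointer at 0 are needed (so the $m = -1$ exception and the periodic boundaries are left out), and the names `transition`, `histories`, and `prob_in_nucleus` are hypothetical.

```python
from math import cos, sin

theta = 0.3  # illustrative; alpha = cos(theta), beta = sin(theta)
alpha, beta = cos(theta), sin(theta)


def transition(state):
    """One step of T = S_a R S_d, (12.52), on a basis state (m, n)."""
    m, n = state
    if n == -1:                     # S_d, exceptions (12.38)
        n = 1
    elif n != 0:                    # S_d: a started clock keeps running
        n += 1
    if m == 2 and n in (0, 1):      # R: detection event at m = 2, (12.40)
        n = 1 - n
    if m == 0:                      # S_a: stay in the nucleus or decay
        return {(0, n): alpha, (1, n): beta}
    return {(m + 1, n): 1.0}


def histories(steps):
    """Return (path, weight) for the histories starting from (0, 0)."""
    frontier = [([(0, 0)], 1.0)]
    for _ in range(steps):
        frontier = [
            (path + [new], amp * coeff)
            for path, amp in frontier
            for new, coeff in transition(path[-1]).items()
        ]
    return [(path, abs(amp) ** 2) for path, amp in frontier]


t_bar = 8
family = histories(t_bar)


def prob_in_nucleus(t, n_bar):
    """Pr(m = 0 at time t | n = n_bar at time t_bar)."""
    cond = [(p, w) for p, w in family if p[t_bar][1] == n_bar]
    total = sum(w for _, w in cond)
    return sum(w for p, w in cond if p[t][0] == 0) / total


for n_bar in range(1, t_bar - 1):
    tau = t_bar - n_bar - 2         # decay time (12.53)
    print(n_bar, tau,
          prob_in_nucleus(tau, n_bar),
          prob_in_nucleus(tau + 1, n_bar))
```

For every pointer reading between 1 and $\bar t - 2$ the two printed probabilities are 1 and 0, because exactly one history of the family is compatible with that reading.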

Does the fact that we can assign a decay time in the case of our toy model mean 
that the same thing is possible for real alpha decay? The answer is presumably 
“yes”, provided one does not require that the decay time be defined too precisely. 
However, finding a suitable criterion for the nucleus to have or have not decayed 
and checking consistency conditions for an appropriate family pose nontrivial tech- 
nical issues, and the matter does not seem to have been studied in detail. Note that 
even in the toy model the decay time is not precisely defined, because time is
discretized, and $\tau + 1$ has as much justification for being identified with the decay
time as does $\tau$. This uncertainty can, however, be much shorter than the half life
of the nucleus, which is of the order of $|\beta|^{-2}$.


Quantum interference


13.1 Two-slit and Mach-Zehnder interferometers 

Interference effects involving quantum particles reflect both the wave-like and 
particle-like properties of quantum entities. One of the best-known examples is 
the interference pattern produced by a double slit. Quantum particles — photons 
or neutrons or electrons — are sent one at a time through the slit system shown in 
Fig. 13.1, and later arrive at a series of detectors located in the diffraction zone far 
from the slits. The detectors are triggered at random, with each particle triggering 
just one detector. After enough particles have been detected, an interference pattern 
can be discerned in the relative counting rates of the different detectors, indicated 
by the length of the horizontal bars in the figure. Lots of particles arrive at some 
detectors, very few particles at others. 



Fig. 13.1. Interference pattern for a wave arriving from the left and passing through the 
two slits. Each circle on the right side represents a detector, and the black bar to its right 
indicates the relative counting rate. 

The relative number of particles arriving at each detector depends on the differ- 
ence of the distances between the detector and the two slits, in units of the particle’s 
de Broglie wavelength. Furthermore, this interference pattern persists even at very
low intensities, say one particle per second passing through the slit system. Hence 
it seems very unlikely that it arises from a sort of cooperative phenomenon in which 
a particle going through one slit compares notes with a particle going through the 
other slit. Instead, each particle must somehow pass through both slits, for how 
else can one understand the interference effect? 




Fig. 13.2. Detectors directly behind the two slits. The black bars are again proportional to 
the counting rates. 

However, if detectors are placed directly behind the two slits, Fig. 13.2, then 
either one or the other detector detects a particle, and it is never the case that both 
detectors simultaneously detect a particle. Furthermore, the total counting rate for 
the arrangement in Fig. 13.2 is the same as that in Fig. 13.1, suggesting that if a 
particle had not been detected just behind one of the slits, it would have continued 
on into the diffraction zone and arrived at one of the detectors located there. Thus it 
seems plausible that the particles which do arrive in the diffraction zone in Fig. 13.1 
have earlier passed through one or the other of the two slits, and not both. But this 
is difficult to reconcile with the interference effect seen in the diffraction zone, 
which seems to require that each particle pass through both slits. Could a particle 
passing through one slit somehow sense the presence of the other slit, and take this 
into account when it arrives in the diffraction zone? 

In Feynman’s discussion of two-slit interference (see bibliography), he considers 
what happens if there is a nondestructive measurement of which slit the particle 
passes through, a measurement that allows the particle to continue on its way and 
later be detected in the diffraction zone. His quantum particles are electrons, and he 
places a light source just behind the slits, Fig. 13.3. By scattering a photon off the 
electron one can “see” which slit it has just passed through. Illuminating the slits 
in this way washes out the interference effect, and the intensities in the diffraction
zone can be explained as sums of intensities due to electrons coming through each
of the two slits.

Fig. 13.3. A light source L between the slits washes out the electron interference pattern.

Feynman then imagines reducing the intensity of the light source to such a de- 
gree that sometimes an electron scatters a photon, revealing which slit it passed 
through, and sometimes it does not. Data for electrons arriving in the diffraction 
zone are then segregated into two sets: one set for “visible” electrons which ear- 
lier scattered a photon, and the other for electrons which were “invisible” as they 
passed through the slit system. When the set of data for the “visible” electrons is 
examined it shows no interference effects, whereas that for the “invisible” electrons 
indicates that they arrive in the diffraction zone with the same interference pattern 
as when there is no source of light behind the slits. Can the behavior of an electron 
really depend upon whether or not it has been seen? 

In this chapter we explore these paradoxes using a toy Mach-Zehnder interfer- 
ometer, which exhibits the same sorts of paradoxes as a double slit, but is easier 
to discuss. A Mach-Zehnder interferometer, Fig. 13.4, consists of a beam splitter
followed by two mirrors which bring the split beams back together again, and
a second beam splitter placed where the reflected beams intersect. Detectors can 
be placed on the output channels. We assume that light from a monochromatic 
source enters the first beam splitter through the a channel. The intensity of light 
emerging in the two output channels e and f depends on the difference in path
length, measured in units of the wavelength of the light, in the c and d arms of
the interferometer. (The classical wave theory of light suffices for calculating these
intensities; one does not need quantum theory.) We shall assume that this difference
has been adjusted so that after the second beam splitter all the light which
enters through the a channel emerges in the f channel and none in the e channel.
Rather than changing the physical path lengths, it is possible to alter the final
intensities by inserting phase shifters in one or both arms of the interferometer.
(A phase shifter is a piece of dielectric material which, when placed in the light
beam, alters the optical path length (number of wavelengths) between the two beam
splitters.)

Fig. 13.4. Mach-Zehnder interferometer with detectors. The beam splitters are labeled $B_1$
and $B_2$.

An interferometer for neutrons which is analogous to a Mach-Zehnder interfer- 
ometer for photons can be constructed using a single crystal of silicon. For our 
purposes the difference between these two types of interferometer is not important, 
since neutrons are quantum particles that behave like waves, and photons are light 
waves that behave like particles. Thus while we shall continue to think of pho- 
tons going through a Mach-Zehnder interferometer, the toy model introduced in 
Sec. 13.2 could equally well describe the interference of neutrons. 

The analogy between a Mach-Zehnder interferometer and double-slit interfer- 
ence is the following. Each photon on its way through the interferometer must 
pass through the c arm or the d arm in much the same way that a particle (photon 
or something else) must pass through one of the two slits on its way to a detector 
in the diffraction zone. The first beam splitter provides a source of coherent light 
(that is, the relative phase is well defined) for the two arms of the interferometer, 
just as one needs a coherent source of particles illuminating the two slits. (This 
coherent source can be a single slit a long distance to the left of the double slit.) 
The second beam splitter in the interferometer combines beams from the separate 
arms and makes them interfere in a way which is analogous to the interference of 
the beams emerging from the two slits when they reach the diffraction zone. 


13.2 Toy Mach-Zehnder interferometer

We shall set up a stochastic or probabilistic model of a toy Mach-Zehnder inter- 
ferometer, Fig. 13.5, and discuss what happens when a single particle or photon 
passes through the instrument. The model will supply us with probabilities for 
different possible histories of this single particle. If one imagines, as in a real ex- 
periment, lots of particles going through the apparatus, one after another, then each 
particle represents an “independent trial” in the sense of probability theory. That 
is, each particle will follow (or undergo) a particular history chosen randomly from 
the collection of all possible histories. If a large number of particles are used, then 
the number which follow some given history will be proportional to the probabil- 
ity, computed by the laws of quantum theory, that a single particle will follow that 
history. 



Fig. 13.5. Toy Mach-Zehnder interferometer constructed from two beam splitters of the 
sort shown in Fig. 12.1. 

The toy Mach-Zehnder interferometer consists of two toy beam splitters, of the 
type shown in Fig. 12.1, in series. The arms and the entrance and output channels 
are labeled in a way which corresponds to Fig. 13.4. The unitary time transformation
for the toy model is $T = S_i$, where the operator $S_i$ is defined by

$$ S_i |mz\rangle = |(m+1)z\rangle \tag{13.1} $$

for m an integer, and z = a, b, c, d, e or f, with the exceptions

$$ S_i|0a\rangle = (+|1c\rangle + |1d\rangle)/\sqrt{2}, \quad S_i|0b\rangle = (-|1c\rangle + |1d\rangle)/\sqrt{2}, $$
$$ S_i|3c\rangle = (+|4e\rangle + |4f\rangle)/\sqrt{2}, \quad S_i|3d\rangle = (-|4e\rangle + |4f\rangle)/\sqrt{2}. \tag{13.2} $$
(See the comment following (12.2) on the choice of phases.) In addition, the usual
provision must be made for periodic boundary conditions, but (as usual) these will
not play any role in the discussion which follows; see the remarks in Sec. 12.1. The
transformation $S_i$ is unitary because it maps an orthonormal basis, the collection of
states $\{|mz\rangle\}$, onto an orthonormal basis of the Hilbert space. A particle (photon)
which enters the a channel undergoes a unitary time evolution of the form

$$ |0a\rangle \mapsto |1\bar a\rangle \mapsto |2\bar a\rangle \mapsto |3\bar a\rangle \mapsto |4f\rangle \mapsto |5f\rangle \mapsto \cdots, \tag{13.3} $$

where, as in (12.6),

$$ |m\bar a\rangle = (+|mc\rangle + |md\rangle)/\sqrt{2}, \quad |m\bar b\rangle = (-|mc\rangle + |md\rangle)/\sqrt{2} \tag{13.4} $$

are superpositions of states of the particle in the c and d arms of the interferometer,
with phases chosen to correspond to unitary evolution under $S_i$ starting with $|0a\rangle$
and $|0b\rangle$, respectively.
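
The evolution in (13.3) can be checked numerically. The following is a minimal sketch, not part of the original text: it stores a state as a Python dictionary mapping a site (m, z) to its amplitude, and the names `SPLIT` and `step` are illustrative choices. Periodic boundary conditions are ignored, which is harmless for the few time steps considered.

```python
import math

R2 = 1 / math.sqrt(2)

# Exceptions (13.2): the action of S_i at the two beam splitters.
SPLIT = {
    (0, "a"): [((1, "c"), +R2), ((1, "d"), +R2)],
    (0, "b"): [((1, "c"), -R2), ((1, "d"), +R2)],
    (3, "c"): [((4, "e"), +R2), ((4, "f"), +R2)],
    (3, "d"): [((4, "e"), -R2), ((4, "f"), +R2)],
}

def step(state):
    """Apply S_i once to a state stored as {(m, z): amplitude}."""
    new = {}
    for (m, z), amp in state.items():
        # Default rule (13.1): hop one site forward in the same channel.
        targets = SPLIT.get((m, z), [((m + 1, z), 1.0)])
        for site, coeff in targets:
            new[site] = new.get(site, 0.0) + amp * coeff
    # Drop amplitudes that have cancelled to (numerical) zero.
    return {s: a for s, a in new.items() if abs(a) > 1e-12}

state = {(0, "a"): 1.0}
for t in range(1, 6):
    state = step(state)
    print(t, state)
```

At t = 1, 2, 3 the printed state has equal amplitudes in the c and d arms, which is the superposition $|m\bar a\rangle$ of (13.4). At t = 4 the two contributions to the e channel cancel, and only the f channel survives.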

The probability that the particle emerges in the e or in the f channel is influenced
by what happens in both arms of the interferometer, as can be seen in the following
way. Let us introduce toy phase shifters in the c and d arms by using in place of $S_i$
a unitary time transformation $S'_i$ identical to $S_i$, (13.1) and (13.2), except that

$$ S'_i|2c\rangle = e^{i\phi_c}|3c\rangle, \quad S'_i|2d\rangle = e^{i\phi_d}|3d\rangle, \tag{13.5} $$

where $\phi_c$ and $\phi_d$ are phase shifts. Obviously $S'_i$ is unitary, and it is the same as $S_i$
when $\phi_c$ and $\phi_d$ are zero. If we use $S'_i$ in place of $S_i$, the unitary time evolution in
(13.3) becomes

$$ |0a\rangle \mapsto |1\bar a\rangle \mapsto |2\bar a\rangle = (|2c\rangle + |2d\rangle)/\sqrt{2} \mapsto (e^{i\phi_c}|3c\rangle + e^{i\phi_d}|3d\rangle)/\sqrt{2} $$
$$ \mapsto \tfrac{1}{2}\left[(e^{i\phi_c} - e^{i\phi_d})|4e\rangle + (e^{i\phi_c} + e^{i\phi_d})|4f\rangle\right] \mapsto \cdots, \tag{13.6} $$

where the result at t = 5 is obtained by replacing $|4e\rangle$ by $|5e\rangle$, and $|4f\rangle$ by $|5f\rangle$.

Consider a consistent family of histories based upon an initial state $|0a\rangle$ at t = 0
and a decomposition of the identity corresponding to the orthonormal basis $\{|mz\rangle\}$
at a second time t = 4. There are two histories with positive weight,

$$ Y = [0a]_0 \odot [4e]_4, \quad Y' = [0a]_0 \odot [4f]_4, \tag{13.7} $$

where, as usual, subscripts indicate the time. The probabilities can be read off
from the t = 4 term in (13.6), treating it as a pre-probability, by taking the absolute
squares of the coefficients of $|4e\rangle$ and $|4f\rangle$:

$$ \Pr([4e]_4) = \Pr(Y) = |e^{i\phi_c} - e^{i\phi_d}|^2/4 = [\sin(\Delta\phi/2)]^2, $$
$$ \Pr([4f]_4) = \Pr(Y') = |e^{i\phi_c} + e^{i\phi_d}|^2/4 = [\cos(\Delta\phi/2)]^2, \tag{13.8} $$

where

$$ \Delta\phi = \phi_c - \phi_d \tag{13.9} $$

is the difference between the two phase shifts. Since these probabilities depend
upon $\Delta\phi$, and thus upon what is happening in both arms of the interferometer,
the quantum particle must in some sense be delocalized as it passes through the
interferometer, rather than localized in arm c or in arm d. On the other hand, it is a
mistake to think of it as simultaneously present in both arms in the sense that “it is
in c and at the same time it is in d.” See the remarks in Sec. 4.5: a quantum particle
cannot be in two places at the same time.
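
As a quick numerical check that the output probabilities depend only on the difference of the two phase shifts, one can evaluate the coefficients of the t = 4 term in (13.6) directly. This is a sketch under the conventions of (13.2) and (13.5); the function name `output_probabilities` and the sample phase values are illustrative, not from the original text.

```python
import cmath
import math

def output_probabilities(phi_c, phi_d):
    """Return (Pr(e), Pr(f)) at t = 4 for the initial state |0a>."""
    # Amplitudes of |3c> and |3d> after the phase shifters, as in (13.6).
    amp_c = cmath.exp(1j * phi_c) / math.sqrt(2)
    amp_d = cmath.exp(1j * phi_d) / math.sqrt(2)
    # Second beam splitter, (13.2): |3c> -> (+|4e> + |4f>)/sqrt(2),
    #                               |3d> -> (-|4e> + |4f>)/sqrt(2).
    amp_e = (amp_c - amp_d) / math.sqrt(2)
    amp_f = (amp_c + amp_d) / math.sqrt(2)
    return abs(amp_e) ** 2, abs(amp_f) ** 2

# The last two entries have (up to rounding) the same difference phi_c - phi_d.
for phi_c, phi_d in [(0.0, 0.0), (math.pi, 0.0), (1.3, 0.4), (3.3, 2.4)]:
    pr_e, pr_f = output_probabilities(phi_c, phi_d)
    delta = phi_c - phi_d
    print(f"delta_phi={delta:.3f}  Pr(e)={pr_e:.6f}  "
          f"sin^2={math.sin(delta / 2) ** 2:.6f}  Pr(f)={pr_f:.6f}")
```

Shifting both phases by the same amount leaves the printed probabilities unchanged, in agreement with (13.8).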

Similarly, if we want to understand double-slit interference using this analogy, 
we would like to say that the particle “goes through both slits,” without meaning 
that it is present in the upper slit at the same time as it is present in the lower slit, 
or that it went through one slit or the other and we do not know which. See the 
discussion of the localization of quantum particles in Secs. 2.3 and 4.5. Speaking of 
the particle as “passing through the slit system” conveys roughly the right meaning. 
In the double-slit experiment, one could introduce phase shifters behind each slit, 
and thereby shift the positions of the peaks and valleys of the interference pattern 
in the diffraction zone. Again, it is the difference of the phase shifts which is 
important, and this shows that one somehow has to think of the quantum particle 
as a coherent entity as it passes through the slit system. 

Very similar results are obtained if instead of $|0a\rangle$ one uses a wave packet

$$ |\psi_0\rangle = c|{-2}a\rangle + c'|{-1}a\rangle + c''|0a\rangle \tag{13.10} $$

in the a channel as the initial state at t = 0, where c, c', and c'' are numerical
coefficients. For such an initial state it is convenient to use histories

$$ A = [\psi_0] \odot E_t, \quad A' = [\psi_0] \odot F_t, \tag{13.11} $$

rather than Y and Y' in (13.7), where

$$ E = \sum_m [me], \quad F = \sum_m [mf] \tag{13.12} $$

are projectors for the particle to be someplace in the e and f channels, respectively,
and $E_t$ means the particle is in the e channel at time t; see the analogous (12.16).
As long as $t \geq 6$, so that the entire wave packet corresponding to $|\psi_0\rangle$ has a chance
to emerge from the interferometer, one finds that the corresponding probabilities
are

$$ \Pr(E_t) = \Pr(A) = [\sin(\Delta\phi/2)]^2, $$
$$ \Pr(F_t) = \Pr(A') = [\cos(\Delta\phi/2)]^2, \tag{13.13} $$

precisely the same as in (13.8). Since the philosophy behind toy models is sim-
plicity and physical insight, not generality, we shall use only the simple initial state
$|0a\rangle$ in what follows, even though a good part of the discussion would hold (with
some fairly obvious modifications) for a more general initial state representing a
wave packet entering the interferometer in the a channel.
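
The wave packet result can be illustrated by evolving a three-site packet under $S'_i$ and summing the probability over all sites of each output channel. This is only a sketch: the coefficients chosen for c, c', c'' and the phase values are arbitrary illustrative assumptions, and the helper names are hypothetical.

```python
import cmath
import math

R2 = 1 / math.sqrt(2)

def targets(m, z, phi_c, phi_d):
    """Action of S'_i on |mz>, from (13.1), (13.2) and (13.5)."""
    if (m, z) == (0, "a"):
        return [((1, "c"), R2), ((1, "d"), R2)]
    if (m, z) == (0, "b"):
        return [((1, "c"), -R2), ((1, "d"), R2)]
    if (m, z) == (2, "c"):
        return [((3, "c"), cmath.exp(1j * phi_c))]
    if (m, z) == (2, "d"):
        return [((3, "d"), cmath.exp(1j * phi_d))]
    if (m, z) == (3, "c"):
        return [((4, "e"), R2), ((4, "f"), R2)]
    if (m, z) == (3, "d"):
        return [((4, "e"), -R2), ((4, "f"), R2)]
    return [((m + 1, z), 1.0)]

def evolve(state, steps, phi_c, phi_d):
    """Apply S'_i the given number of times to {(m, z): amplitude}."""
    for _ in range(steps):
        new = {}
        for (m, z), amp in state.items():
            for site, coeff in targets(m, z, phi_c, phi_d):
                new[site] = new.get(site, 0.0) + amp * coeff
        state = new
    return state

def channel_probability(state, channel):
    """Probability of being anywhere in a channel (projector E or F)."""
    return sum(abs(a) ** 2 for (m, z), a in state.items() if z == channel)

phi_c, phi_d = 1.1, 0.3
# Illustrative normalized coefficients c, c', c'' of (13.10).
psi0 = {(-2, "a"): 0.5, (-1, "a"): 0.5j, (0, "a"): R2}
for t in (5, 6, 7):
    s = evolve(psi0, t, phi_c, phi_d)
    print(t, channel_probability(s, "e"), channel_probability(s, "f"))
print("sin^2, cos^2:", math.sin(0.4) ** 2, math.cos(0.4) ** 2)
```

At t = 5 the component that started at site -2 is still inside the interferometer, so the two channel probabilities do not yet add up to 1. From t = 6 on they agree with (13.13).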

What can we say about the particle while it is inside the interferometer, during
the time interval for which the histories in (13.7) provide no information? There
are various ways of refining these histories by inserting additional events at times
between t = 0 and 4. For example, one can employ unitary extensions, Sec. 11.7,
of Y and Y' by using the unitary time development of the initial $|0a\rangle$ at intermediate
times to obtain two histories

$$ Y^e = [0a] \odot [1\bar a] \odot [2\bar a] \odot [3\bar q] \odot [4e], $$
$$ Y^f = [0a] \odot [1\bar a] \odot [2\bar a] \odot [3\bar q] \odot [4f], \tag{13.14} $$

defined at t = 0, 1, 2, 3, 4, which form the support of a consistent family with
initial state [0a]. The projector $[3\bar q]$ is onto the state

$$ |3\bar q\rangle := (e^{i\phi_c}|3c\rangle + e^{i\phi_d}|3d\rangle)/\sqrt{2}. \tag{13.15} $$

The histories in (13.14) are identical up to t = 3, and then split. One can place
the split earlier, between t = 2 and t = 3, by mapping [4e] and [4f] unitarily
backwards in time to t = 3:

$$ \bar Y^e = [0a] \odot [1\bar a] \odot [2\bar a] \odot [3\bar b] \odot [4e], $$
$$ \bar Y^f = [0a] \odot [1\bar a] \odot [2\bar a] \odot [3\bar a] \odot [4f]. \tag{13.16} $$

Note that $Y$, $Y^e$, and $\bar Y^e$ all have exactly the same chain operator, for reasons dis-
cussed in Sec. 11.7, and the same is true of $Y'$, $Y^f$, and $\bar Y^f$. The consistency of
the family (13.7) is automatic, as only two times are involved, Sec. 11.3. As a
consequence the unitary extensions (13.14) and (13.16) of that family are supports
of consistent families; see Sec. 11.7.

The families in (13.14) and (13.16) can be used to discuss some aspects of the
particle’s behavior while inside the interferometer, but cannot tell us whether it was
in the c or in the d arm, because the projectors C and D, (12.9), do not commute
with projectors onto superposition states, such as $[1\bar a]$, $[3\bar q]$, or $[3\bar b]$. Instead, we
must look for alternative families in which events of the form [mc] or [md] appear
at intermediate times. It will simplify the discussion if we assume that $\phi_c = 0 = \phi_d$,
that is, use $S_i$ for time development rather than the more general $S'_i$.

One consistent family of this type has for its support the two elementary histories

$$ Y^c = [0a] \odot [1c] \odot [2c] \odot [3c] \odot [4\bar c] \odot [5\bar c] \odot \cdots [\tau\bar c], $$
$$ Y^d = [0a] \odot [1d] \odot [2d] \odot [3d] \odot [4\bar d] \odot [5\bar d] \odot \cdots [\tau\bar d], \tag{13.17} $$

where

$$ |m\bar c\rangle = (+|me\rangle + |mf\rangle)/\sqrt{2}, \quad |m\bar d\rangle = (-|me\rangle + |mf\rangle)/\sqrt{2} \tag{13.18} $$

for $m \geq 4$ correspond to unitary time evolution starting with $|3c\rangle$ and $|3d\rangle$, respec-
tively. The final time $\tau$ can be as large as one wants, consistent with the particle
not having passed out of the e or f channels due to the periodic boundary condi-
tion. The histories in (13.17) are unitary extensions of $[0a] \odot [1c]$ and $[0a] \odot [1d]$,
and consistency follows from the general arguments given in Sec. 11.7. Note that
if we use $Y^c$ and $Y^d$, we cannot say whether the particle emerges in the e or f
channel of the second beam splitter, whereas if we use $Y^e$ and $Y^f$ in (13.14), with
$\phi_c = 0 = \phi_d$, we can say that the particle leaves this beam splitter in a definite
channel, but we cannot discuss the channel in which it arrives at the beam splitter.

In order to describe the particle as being in a definite arm of the interferometer
and emerging in a definite channel from the second beam splitter, one might try a
family which includes

$$ Y^{ce} = [0a] \odot [1c] \odot [2c] \odot [3c] \odot [4e] \odot [5e] \odot \cdots, $$
$$ Y^{cf} = [0a] \odot [1c] \odot [2c] \odot [3c] \odot [4f] \odot [5f] \odot \cdots, $$
$$ Y^{de} = [0a] \odot [1d] \odot [2d] \odot [3d] \odot [4e] \odot [5e] \odot \cdots, $$
$$ Y^{df} = [0a] \odot [1d] \odot [2d] \odot [3d] \odot [4f] \odot [5f] \odot \cdots, \tag{13.19} $$

continuing till some final time $\tau$. Alas, this will not work. The family is inconsis-
tent, because

$$ \langle K(Y^{ce}), K(Y^{de})\rangle \neq 0, \quad \langle K(Y^{cf}), K(Y^{df})\rangle \neq 0, \tag{13.20} $$

as is easily shown using the corresponding chain kets (Sec. 11.6). In fact, each of
the histories in (13.19) is intrinsically inconsistent in the sense that there is no way
of making it part of some consistent family. See the discussion of intrinsic incon-
sistency in Sec. 11.8; the strategy used there for histories involving three times is
easily extended to cover the somewhat more complicated situation represented in
(13.19).
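
The inconsistency in (13.20) can be made concrete with chain kets. The sketch below assumes zero phase shifts and truncates each history of (13.19) at t = 4; the later events simply follow unitary evolution and do not change the inner products. A chain ket is built by alternating one time step with a projection onto the site named in the history. All function names are illustrative, and only the beam-splitter rules needed for an initial state $|0a\rangle$ are included.

```python
import math

R2 = 1 / math.sqrt(2)
SPLIT = {
    (0, "a"): [((1, "c"), +R2), ((1, "d"), +R2)],
    (3, "c"): [((4, "e"), +R2), ((4, "f"), +R2)],
    (3, "d"): [((4, "e"), -R2), ((4, "f"), +R2)],
}

def step(state):
    """Apply S_i (zero phase shifts) to a state {(m, z): amplitude}."""
    new = {}
    for (m, z), amp in state.items():
        for site, coeff in SPLIT.get((m, z), [((m + 1, z), 1.0)]):
            new[site] = new.get(site, 0.0) + amp * coeff
    return new

def chain_ket(events):
    """Chain ket for [0a] followed by one site projector per time step."""
    state = {(0, "a"): 1.0}
    for site in events:
        state = step(state)
        state = {site: state[site]} if site in state else {}
    return state

def inner(u, v):
    """Inner product of two kets stored as dictionaries."""
    return sum(complex(a).conjugate() * v[s] for s, a in u.items() if s in v)

def history(arm, out):
    """Events at t = 1..4: a definite arm, then a definite output channel."""
    return [(1, arm), (2, arm), (3, arm), (4, out)]

kets = {arm + out: chain_ket(history(arm, out)) for arm in "cd" for out in "ef"}
print(inner(kets["ce"], kets["de"]))  # nonzero: the family is inconsistent
print(inner(kets["cf"], kets["df"]))  # nonzero as well
print(inner(kets["ce"], kets["cf"]))  # zero: different final sites
```

The two histories that end in the same output channel have chain kets proportional to the same final state, so their inner product cannot vanish.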

The analog of (13.14) for two-slit interference is a consistent family T in which
the particle passes through the slit system in a delocalized state, but arrives at a
definite location in the diffraction zone. It is T which lies behind conventional dis-
cussions of two-slit interference, which emphasize (correctly) that in such circum-
stances it is meaningless to discuss which slit the particle passed through. However,
there is also another consistent family Q, the analog of (13.17), in which the particle
passes through one or the other of the two slits, and is described in the diffraction
zone by one of two delocalized wave packets, the counterparts of the $\bar c$ and $\bar d$ su-
perpositions defined in (13.18). Although these wave packets overlap in space,
they are orthogonal to each other and thus represent distinct quantum states. The
families T and Q are incompatible, and hence the descriptions they provide cannot
be combined. Attempting to do so by assuming that the particle goes through a
definite slit and arrives at a definite location in the diffraction zone gives rise to
inconsistencies analogous to those noted in connection with (13.19).

From the perspective of fundamental quantum theory there is no reason to prefer 
one of these two families to the other. Each has its use for addressing certain types 
of physical question. If one wants to know the location of the particle when it 
reaches the diffraction zone, T must be used in preference to Q, because it is only 
in T that this location makes sense. On the other hand, if one wants to know 
which slit the particle passed through, Q must be employed, for in T the concept 
of passing through a particular slit makes no sense. Experiments can be carried out 
to check the predictions of either family, and the Mach-Zehnder analogs of these 
two kinds of experiments are discussed in the next two sections. 


13.3 Detector in output of interferometer 

Let us add to the e output channel of our toy Mach-Zehnder interferometer a simple
two-state detector of the type introduced in Sec. 7.4 and used in Sec. 12.2, see
Fig. 12.2. The detector states are $|0\hat e\rangle$, “ready”, and $|1\hat e\rangle$, “triggered”, and the
unitary time development operator is

$$ T = S'_i R_e, \tag{13.21} $$

where $R_e$ is the identity on the Hilbert space $\mathcal{M} \otimes \mathcal{E}$ of particle-plus-detector,
except for

$$ R_e|4e, n\hat e\rangle = |4e, (1-n)\hat e\rangle, \tag{13.22} $$

with n = 0 or 1, which is the analog of (12.27). Thus, in particular,

$$ T|4e, 0\hat e\rangle = |5e, 1\hat e\rangle, \quad T|4f, 0\hat e\rangle = |5f, 0\hat e\rangle, \tag{13.23} $$

so the detector is triggered by the particle emerging in the e channel as it hops
from 4e to 5e, but is not triggered if the particle emerges in the f channel. We
could add a second detector for the f channel, but that is not necessary: if the e
channel detector remains in its ready state after a certain time, that will tell us that
the particle emerged in the f channel. See the discussion in Sec. 12.2.

Assume an initial state

$$ |\Psi_0\rangle = |0a, 0\hat e\rangle, \tag{13.24} $$

and consider histories that are the obvious counterparts of those in (13.14),

$$ Z^e = Z^i \odot [4e, 0\hat e] \odot [5e, 1\hat e] \odot [6e, 1\hat e] \odot \cdots [\tau e, 1\hat e], $$
$$ Z^f = Z^i \odot [4f, 0\hat e] \odot [5f, 0\hat e] \odot [6f, 0\hat e] \odot \cdots [\tau f, 0\hat e], \tag{13.25} $$

but which continue on to some final time $\tau$. The initial unitary portion

$$ Z^i = [\Psi_0] \odot [1\bar a, 0\hat e] \odot [2\bar a, 0\hat e] \odot [3\bar q, 0\hat e] \tag{13.26} $$

is the same for both $Z^e$ and $Z^f$. The histories in (13.25) are the support of a con-
sistent family with initial state $|\Psi_0\rangle$, and they contain no surprises. If the particle
passes through the interferometer in a coherent superposition and emerges in chan-
nel e, it triggers the detector and keeps going. If it emerges in f it does not trigger
the detector, and continues to move out that channel. The probability that the de-
tector will be in its triggered state at t = 5 or later is $\sin^2(\Delta\phi/2)$, the same as the
probability calculated earlier, (13.8), that the particle will emerge in the e channel
when no detector is present.

As a second example, suppose that $\phi_c = 0 = \phi_d$, and consider the consistent
family whose support consists of the two histories

$$ Z^c = [\Psi_0] \odot [1c] \odot [2c] \odot [3c] \odot [4\bar c, 0\hat e] \odot [5r] \odot [6r] \odot \cdots [\tau r], $$
$$ Z^d = [\Psi_0] \odot [1d] \odot [2d] \odot [3d] \odot [4\bar d, 0\hat e] \odot [5s] \odot [6s] \odot \cdots [\tau s], \tag{13.27} $$

where the detector state $[0\hat e]$ has been omitted for times earlier than t = 4 (it could
be included at all these times in both histories), and

$$ |mr\rangle = \frac{|me, 1\hat e\rangle + |mf, 0\hat e\rangle}{\sqrt{2}}, \quad |ms\rangle = \frac{-|me, 1\hat e\rangle + |mf, 0\hat e\rangle}{\sqrt{2}} \tag{13.28} $$

are superpositions of states in which the detector has and has not been triggered,
so they are toy MQS (macroscopic quantum superposition) states, as in (12.29).
The histories in (13.27) are obvious counterparts of those in (13.17), and they are
unitary extensions (Sec. 11.7) to later times of $[\Psi_0] \odot [1c, 0\hat e]$ and $[\Psi_0] \odot [1d, 0\hat e]$.

The toy MQS states at time $t \geq 5$ in (13.27) are hard to interpret, and their
grown-up counterparts for a real Mach-Zehnder or neutron interferometer are im-
possible to observe in the laboratory. Can we get around this manifestation of
Schrödinger’s cat (Sec. 9.6) by the same method we used in Sec. 12.2: using his-
tories in which the detector is in its pointer basis (see the definition at the end of
Sec. 9.5) rather than in some MQS state? The obvious choice would be something
like

$$ Z^{ce} = [\Psi_0] \odot [1c] \odot [2c] \odot [3c] \odot [4e, 0\hat e] \odot [5e, 1\hat e] \odot \cdots, $$
$$ Z^{cf} = [\Psi_0] \odot [1c] \odot [2c] \odot [3c] \odot [4f, 0\hat e] \odot [5f, 0\hat e] \odot \cdots, $$
$$ Z^{de} = [\Psi_0] \odot [1d] \odot [2d] \odot [3d] \odot [4e, 0\hat e] \odot [5e, 1\hat e] \odot \cdots, $$
$$ Z^{df} = [\Psi_0] \odot [1d] \odot [2d] \odot [3d] \odot [4f, 0\hat e] \odot [5f, 0\hat e] \odot \cdots, \tag{13.29} $$

where, once again, we have omitted the detector state $[0\hat e]$ at times earlier than
t = 4. However, this family is inconsistent: (13.20) holds with Y replaced by Z,
and one can even show that the individual histories in (13.29), like those in (13.19),
are intrinsically inconsistent. Indeed, the history

$$ [\Psi_0]_0 \odot C_t \odot [1\hat e]_{t'}, \tag{13.30} $$

in which the initial state is followed by a particle in the c arm at some time in the
interval $1 \leq t \leq 3$, and then the detector in its triggered state at a later time $t' \geq 5$,
is intrinsically inconsistent, and the same is true if $C_t$ is replaced by $D_t$, or $[1\hat e]_{t'}$
by $[0\hat e]_{t'}$. (For the meaning of $C_t$ or $D_t$, see the discussion following (12.15).)

A similar analysis can be applied to the analogous situation of two-slit interfer-
ence in which a detector is located at some point in the diffraction zone. By using
a family in which the particle passes through the slit system in a delocalized state 
corresponding to unitary time evolution, the analog of (13.25), one can show that 
the probability of detection is the same as the probability of the particle arriving 
at the corresponding region in space in the absence of a detector. There is also a 
family, the analog of (13.27), in which the particle passes through a definite slit, 
and later on the detector is described by an MQS state, the counterpart of one of the 
states defined in (13.28). There is no way of “collapsing” these MQS states into 
pointer states of the detector — this is the lesson to be drawn from the inconsistent 
family (13.29) — as long as one insists upon assigning a definite slit to the particle. 

This example shows that it is possible to construct families of histories using 
events at earlier times which are “normal” (non-MQS), but which have the conse- 
quence that at later times one is “forced” to employ MQS states. If one does not 
want to use MQS states at a later time, it is necessary to change the events in the 
histories at earlier times, or alter the initial states. Note that consistency depends 
upon all the events which occur in a history, because the chain operator depends 
upon all the events, so one cannot say that inconsistency is “caused” by a particular 
event in the history, unless one has decided that other events shall, by definition, 
not share in the blame. 


13.4 Detector in internal arm of interferometer

Let us see what happens if a detector is placed in the c arm inside the toy inter-
ferometer. (A detector could also be placed in the d arm, but this would not lead
to anything new, since if the particle is not detected in the c arm one can conclude
that it passed through the d arm.) The detector states are $|0\hat c\rangle$ “ready” and $|1\hat c\rangle$
“triggered”. The unitary time operator is

$$ T = S'_i R_c, \tag{13.31} $$

where $S'_i$ is defined in (13.5), and $R_c$ is the identity on the space $\mathcal{M} \otimes \mathcal{C}$ of particle
and detector, except for

$$ R_c|2c, n\hat c\rangle = |2c, (1-n)\hat c\rangle. \tag{13.32} $$

In particular,

$$ T|2c, 0\hat c\rangle = e^{i\phi_c}|3c, 1\hat c\rangle, \quad T|2d, 0\hat c\rangle = e^{i\phi_d}|3d, 0\hat c\rangle, \tag{13.33} $$

so the detector is triggered as the particle hops from 2c to 3c when passing through
the c arm, but is not triggered if the particle passes through the d arm.

Consider the unitary time development,

$$ |\Phi_t\rangle = T^t|\Phi_0\rangle, \quad |\Phi_0\rangle = |0a, 0\hat c\rangle, \tag{13.34} $$

of an initial state in which the particle is in the a channel, and the c channel detector
is in its ready state. At t = 4 we have

$$ |\Phi_4\rangle = \tfrac{1}{2}\left(e^{i\phi_c}|4e, 1\hat c\rangle - e^{i\phi_d}|4e, 0\hat c\rangle + e^{i\phi_c}|4f, 1\hat c\rangle + e^{i\phi_d}|4f, 0\hat c\rangle\right), \tag{13.35} $$

where all four states in the sum on the right side are mutually orthogonal.

One can use (13.35) as a pre-probability to compute the probabilities of two-
time histories beginning with the initial state $|\Phi_0\rangle$ at t = 0, and with the particle
in either the e or the f channel at t = 4. Thus consider a family in which the four
histories with nonzero weight are of the form $\Phi_0 \odot [\phi_j]$, where $|\phi_j\rangle$ is one of the
four kets on the right side of (13.35). Each will occur with probability 1/4, and
thus

$$ \Pr([4e]_4) = 1/4 + 1/4 = 1/2 = \Pr([4f]_4). \tag{13.36} $$

Upon comparing these with (13.8) when no detector is present, one sees that in-
serting a detector in one arm of the interferometer has a drastic effect: there is no
longer any dependence of these probabilities upon the phase difference $\Delta\phi$. Thus
a measurement of which arm the particle passes through wipes out all the interfer-
ence effects which would otherwise be apparent in the output intensities following
the second beam splitter. Note the analogy with Feynman’s discussion of the dou-
ble slit: determining which slit the electron goes through, by scattering light off of
it, destroys the interference pattern in the diffraction zone.
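
The loss of phase dependence in (13.36) can be checked by simulating the particle together with the detector. In this sketch a basis state is a triple (m, z, n), where n = 0 or 1 is the detector state; the helper names and the sample phase values are illustrative assumptions.

```python
import cmath
import math

R2 = 1 / math.sqrt(2)

def particle_targets(m, z, phi_c, phi_d):
    """Action of S'_i on the particle, from (13.1), (13.2) and (13.5)."""
    if (m, z) == (0, "a"):
        return [((1, "c"), R2), ((1, "d"), R2)]
    if (m, z) == (2, "c"):
        return [((3, "c"), cmath.exp(1j * phi_c))]
    if (m, z) == (2, "d"):
        return [((3, "d"), cmath.exp(1j * phi_d))]
    if (m, z) == (3, "c"):
        return [((4, "e"), R2), ((4, "f"), R2)]
    if (m, z) == (3, "d"):
        return [((4, "e"), -R2), ((4, "f"), R2)]
    return [((m + 1, z), 1.0)]

def step(state, phi_c, phi_d):
    """Apply T = S'_i R_c, (13.31), to a state {(m, z, n): amplitude}."""
    new = {}
    for (m, z, n), amp in state.items():
        if (m, z) == (2, "c"):
            n = 1 - n  # R_c, (13.32): flip the detector when the particle is at 2c
        for (m2, z2), coeff in particle_targets(m, z, phi_c, phi_d):
            key = (m2, z2, n)
            new[key] = new.get(key, 0.0) + amp * coeff
    return new

def output_probabilities(phi_c, phi_d):
    """Return (Pr(e), Pr(f)) at t = 4, summed over detector states."""
    state = {(0, "a", 0): 1.0}
    for _ in range(4):
        state = step(state, phi_c, phi_d)
    pr_e = sum(abs(a) ** 2 for (m, z, n), a in state.items() if z == "e")
    pr_f = sum(abs(a) ** 2 for (m, z, n), a in state.items() if z == "f")
    return pr_e, pr_f

for phi_c, phi_d in [(0.0, 0.0), (math.pi, 0.0), (1.3, 0.4)]:
    print(phi_c, phi_d, output_probabilities(phi_c, phi_d))
```

For every choice of phases both probabilities come out as 1/2, because the two contributions to each output channel now carry orthogonal detector states and can no longer cancel or reinforce each other.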

Now let us consider various possible histories describing what the particle does
while it is inside the interferometer, assuming $\phi_c = 0 = \phi_d$ in order to simplify the
discussion. Straightforward unitary time evolution will result in a family in which
every $[\Phi_t]$ for $t \geq 3$ is a toy MQS state involving both $|0\hat c\rangle$ and the triggered state
$|1\hat c\rangle$ of the detector. In order to obtain a consistent family without MQS states,
we can let unitary time development continue up until the measurement occurs,
and then have a split (or collapse) to produce the analog of (12.33) in the previous
chapter: a family whose support consists of the two histories

$$ V^c = [0a, 0\hat c] \odot [1\bar a, 0\hat c] \odot [2\bar a, 0\hat c] \odot [3c, 1\hat c] \odot [4\bar c, 1\hat c] \odot \cdots, $$
$$ V^d = [0a, 0\hat c] \odot [1\bar a, 0\hat c] \odot [2\bar a, 0\hat c] \odot [3d, 0\hat c] \odot [4\bar d, 0\hat c] \odot \cdots, \tag{13.37} $$

with states $|m\bar c\rangle$ and $|m\bar d\rangle$ defined in (13.18). One can equally well put the split at
an earlier time, by using histories

$$ Z^c = [0a, 0\hat c] \odot [1c, 0\hat c] \odot [2c, 0\hat c] \odot [3c, 1\hat c] \odot [4\bar c, 1\hat c] \odot \cdots, $$
$$ Z^d = [0a, 0\hat c] \odot [1d, 0\hat c] \odot [2d, 0\hat c] \odot [3d, 0\hat c] \odot [4\bar d, 0\hat c] \odot \cdots, \tag{13.38} $$

which resemble those in (13.17) in that the particle is in the c or in the d arm from
the moment it leaves the first beam splitter.

One can also introduce a second split at the second beam splitter, to produce a
family with support

$$ Z^{ce} = [0a, 0\hat c] \odot [1c, 0\hat c] \odot [2c, 0\hat c] \odot [3c, 1\hat c] \odot [4e, 1\hat c] \odot [5e, 1\hat c] \odot \cdots, $$
$$ Z^{cf} = [0a, 0\hat c] \odot [1c, 0\hat c] \odot [2c, 0\hat c] \odot [3c, 1\hat c] \odot [4f, 1\hat c] \odot [5f, 1\hat c] \odot \cdots, $$
$$ Z^{de} = [0a, 0\hat c] \odot [1d, 0\hat c] \odot [2d, 0\hat c] \odot [3d, 0\hat c] \odot [4e, 0\hat c] \odot [5e, 0\hat c] \odot \cdots, $$
$$ Z^{df} = [0a, 0\hat c] \odot [1d, 0\hat c] \odot [2d, 0\hat c] \odot [3d, 0\hat c] \odot [4f, 0\hat c] \odot [5f, 0\hat c] \odot \cdots. \tag{13.39} $$

This family is consistent, in contrast to (13.19), because the projectors of the dif-
ferent histories at some final time $\tau$ are mutually orthogonal: the orthogonal final
states of the detector prevent the inconsistency which would arise, as in (13.20), if
one only had particle states. In addition, one could place another detector in one
of the output channels. However, when used with a family analogous to (13.39)
this detector would simply confirm the arrival of the particle in the corresponding
channel with the same probability as if the detector had been absent, so one would
learn nothing new.
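
The consistency of (13.39) can be verified with the same chain-ket technique that exposes the inconsistency of (13.19), now including the detector. The sketch assumes zero phase shifts, truncates the histories at t = 4, and uses illustrative helper names; a basis state is a triple (m, z, n) with n the detector state.

```python
import math

R2 = 1 / math.sqrt(2)
SPLIT = {
    (0, "a"): [((1, "c"), +R2), ((1, "d"), +R2)],
    (3, "c"): [((4, "e"), +R2), ((4, "f"), +R2)],
    (3, "d"): [((4, "e"), -R2), ((4, "f"), +R2)],
}

def step(state):
    """Apply T = S_i R_c (zero phase shifts) to {(m, z, n): amplitude}."""
    new = {}
    for (m, z, n), amp in state.items():
        if (m, z) == (2, "c"):
            n = 1 - n  # strong detector in the c arm, (13.32)
        for (m2, z2), coeff in SPLIT.get((m, z), [((m + 1, z), 1.0)]):
            key = (m2, z2, n)
            new[key] = new.get(key, 0.0) + amp * coeff
    return new

def chain_ket(events):
    """Chain ket for [0a, 0c-hat] followed by one projector per time step."""
    state = {(0, "a", 0): 1.0}
    for ev in events:
        state = step(state)
        state = {ev: state[ev]} if ev in state else {}
    return state

def inner(u, v):
    """Inner product of two kets stored as dictionaries."""
    return sum(complex(a).conjugate() * v[k] for k, a in u.items() if k in v)

def history(arm, out):
    n = 1 if arm == "c" else 0  # detector is triggered only on the c path
    return [(1, arm, 0), (2, arm, 0), (3, arm, n), (4, out, n)]

labels = ["ce", "cf", "de", "df"]
kets = {lab: chain_ket(history(lab[0], lab[1])) for lab in labels}
gram = {(p, q): inner(kets[p], kets[q]) for p in labels for q in labels}
for p in labels:
    print(p, [round(abs(gram[(p, q)]), 6) for q in labels])
```

The printed matrix of inner products is diagonal, with each history carrying weight 1/4. The pairs that spoiled consistency in (13.20) are now orthogonal because they end in different detector states.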

Inserting a detector into the c arm of the interferometer provides an instance
of what is often called decoherence. The states $|m\bar a\rangle$ and $|m\bar b\rangle$ defined in (13.4)
are coherent superpositions of the states $|mc\rangle$ and $|md\rangle$ in which the particle is
localized in one or the other arm of the interferometer, and the relative phases in
the superposition are of physical significance, since in the absence of a detector one
of these superpositions will result in the particle emerging in the f channel, and
the other in its emerging in e. However, when something like a cosmic ray interacts
with the particle in a sufficiently different way in the c and the d arm, it destroys
the coherence (the influence of the relative phase), and thus produces decoherence.

The scattering of light in Feynman’s version of the double-slit experiment is 
an example of decoherence in this sense, and it results in interference effects be- 
ing washed out. However, decoherence is usually not an “all or nothing” affair. 
The weakly-coupled detectors discussed in Sec. 13.5 provide an example of par- 
tial decoherence. As well as washing out interference effects, decoherence can ex- 
pand the range of possibilities for constructing consistent families. Thus the family 
based on (13.19) in which the particle is in a definite arm inside the interferometer 
and emerges from the interferometer in a definite channel is inconsistent, whereas 
its counterpart in (13.39), with decoherence taking place inside the interferometer, 
is consistent. Some additional discussion of decoherence will be found in Ch. 26. 


13.5 Weak detectors in internal arms 

As noted in Sec. 13.1, Feynman in his discussion of double-slit interference tells 
us that as the intensity of the light behind the double slits is reduced, one will find 
that those electrons which do not scatter a photon will, when they arrive in the 
diffraction zone, exhibit the same interference pattern as when the light is off. Let 
us try to understand this effect by placing weakly-coupled or weak detectors in the 
c and d arms of the toy Mach-Zehnder interferometer. 

A simple toy weak detector has two orthogonal states, $|0\hat c\rangle$ “ready” and $|1\hat c\rangle$
“triggered”, and the weak coupling is arranged by replacing the unitary transfor-
mation $R_c$ in (13.32) with $R'_c$, which is the identity except for

$$ R'_c|2c, 0\hat c\rangle = \alpha|2c, 0\hat c\rangle + \beta|2c, 1\hat c\rangle, $$
$$ R'_c|2c, 1\hat c\rangle = \gamma|2c, 0\hat c\rangle + \delta|2c, 1\hat c\rangle, \tag{13.40} $$

where $\alpha$, $\beta$, $\gamma$, and $\delta$ are (in general complex) numbers forming a unitary 2 × 2
matrix

$$ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}. \tag{13.41} $$

The “strongly-coupled” or “strong” detector used previously is a special case in
which $\beta = 1 = \gamma$, $\alpha = \delta = 0$. Making $|\beta|$ small results in a weak coupling, since
the probability that the detector will be triggered by the presence of a particle at site
2c is $|\beta|^2$. (One can also modify the time-elapse detector of Sec. 12.3 to make it a
weakly-coupled detector, by modifying (12.40) in a manner analogous to (13.40),
but we will not need it for the present discussion.) It is convenient for purposes of
exposition to assume a symmetrical arrangement in which there is a second detec-
tor, with ready and triggered states $|0\hat d\rangle$ and $|1\hat d\rangle$, in the d arm of the interferometer,
with its coupling to the particle governed by a unitary transformation $R'_d$ equal to
the identity except for

$$ R'_d|2d, 0\hat d\rangle = \alpha|2d, 0\hat d\rangle + \beta|2d, 1\hat d\rangle, $$
$$ R'_d|2d, 1\hat d\rangle = \gamma|2d, 0\hat d\rangle + \delta|2d, 1\hat d\rangle, \tag{13.42} $$

where the numerical coefficients $\alpha$, $\beta$, $\gamma$, and $\delta$ are the same as in (13.40).

The overall unitary time development of the entire system $\mathcal{M} \otimes \mathcal{C} \otimes \mathcal{D}$ consisting
of the particle and the two detectors is determined by the operator

$$ T = S_i R'_c R'_d = S_i R'_d R'_c, \tag{13.43} $$

where $S_i$ (rather than $S'_i$) means the phase shifts $\phi_c$ and $\phi_d$ are 0. The unitary time
evolution,

$$ |\Omega_t\rangle = T^t|\Omega_0\rangle, \quad |\Omega_0\rangle = |0a, 0\hat c, 0\hat d\rangle, \tag{13.44} $$

of an initial state $|\Omega_0\rangle$ in which the particle is at [0a] and both detectors are in their
ready states results in

$$ |\Omega_4\rangle = \alpha|4f, 0\hat c, 0\hat d\rangle + \tfrac{1}{2}\beta\left(|4e, 1\hat c, 0\hat d\rangle + |4f, 1\hat c, 0\hat d\rangle - |4e, 0\hat c, 1\hat d\rangle + |4f, 0\hat c, 1\hat d\rangle\right) \tag{13.45} $$

at t = 4; for any later time $|\Omega_t\rangle$ is given by the same expression with 4 replaced by
t.

Consider a family of two-time histories with initial state $|\Omega_0\rangle$ at t = 0, and at t =
4 a decomposition of the identity in which each detector is in a pointer state (ready
or triggered) and the particle emerges in either the e or the f channel. Consistency
follows from the fact that there are only two times, and the probabilities can be
computed using (13.45) as a pre-probability. There is a finite probability $|\alpha|^2$ that at
t = 4 neither detector has detected the particle, and in this case it always emerges
in the f channel. On the other hand, if the particle has been detected by the c
detector, it will emerge with equal probability in either the e or the f channel, and
the same is true if it has been detected by the d detector.

All of this agrees with Feynman’s discussion of electrons passing through a dou- 
ble slit and illuminated by a weak light source. Emerging in the / channel rather 
than the e channel is what happens when no detectors are present inside the in- 
terferometer, and represents an interference effect. By contrast, detection of the 
particle in either arm washes out the interference effect, and the particle emerges
with equal probability in either the e or the / channel. Note that the probability 
is zero that both detectors will detect the particle. This is what one would expect, 
since the particle cannot be both in the c arm and in the d arm of the interferometer; 
quantum particles are never in two different places at the same time. 
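
The probabilities read off from (13.45) can be reproduced by direct simulation of the particle and the two weak detectors. The sketch below is an illustration only: it picks one convenient real family of coefficients, $\alpha = \cos\theta$, $\beta = \gamma = \sin\theta$, $\delta = -\cos\theta$ (an assumption, chosen so that the matrix (13.41) is unitary and $\theta = \pi/2$ recovers the strong detector), and the helper names are hypothetical. A basis state is (m, z, nc, nd), with nc and nd the states of the c and d detectors.

```python
import math

R2 = 1 / math.sqrt(2)

def weak_flip(n, alpha, beta, gamma, delta):
    """Action of R' on detector state n, (13.40): [(new_n, coefficient), ...]."""
    return [(0, alpha), (1, beta)] if n == 0 else [(0, gamma), (1, delta)]

def particle_step(m, z):
    """S_i with zero phase shifts, (13.1) and (13.2)."""
    if (m, z) == (0, "a"):
        return [((1, "c"), R2), ((1, "d"), R2)]
    if (m, z) == (0, "b"):
        return [((1, "c"), -R2), ((1, "d"), R2)]
    if (m, z) == (3, "c"):
        return [((4, "e"), R2), ((4, "f"), R2)]
    if (m, z) == (3, "d"):
        return [((4, "e"), -R2), ((4, "f"), R2)]
    return [((m + 1, z), 1.0)]

def step(state, coeffs):
    """Apply T = S_i R'_c R'_d, (13.43), to {(m, z, nc, nd): amplitude}."""
    new = {}
    for (m, z, nc, nd), amp in state.items():
        # The detector couplings act first, and only at sites 2c and 2d.
        det = [((nc, nd), 1.0)]
        if (m, z) == (2, "c"):
            det = [((k, nd), c) for k, c in weak_flip(nc, *coeffs)]
        elif (m, z) == (2, "d"):
            det = [((nc, k), c) for k, c in weak_flip(nd, *coeffs)]
        for (kc, kd), dc in det:
            for (m2, z2), pc in particle_step(m, z):
                key = (m2, z2, kc, kd)
                new[key] = new.get(key, 0.0) + amp * dc * pc
    return new

theta = 0.3  # small theta means weak coupling, |beta|^2 = sin^2(theta)
coeffs = (math.cos(theta), math.sin(theta), math.sin(theta), -math.cos(theta))
state = {(0, "a", 0, 0): 1.0}
for _ in range(4):
    state = step(state, coeffs)
probs = {k: abs(a) ** 2 for k, a in state.items() if abs(a) > 1e-12}
for key in sorted(probs):
    print(key, round(probs[key], 6))
```

With neither detector triggered the particle appears only in the f channel, with probability $|\alpha|^2$. Each of the four outcomes in which exactly one detector is triggered has probability $|\beta|^2/4$, and no outcome has both detectors triggered.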

Additional complications arise when there is a weakly-coupled detector in only
one arm, or when the numerical coefficients in (13.42) are different from those in
(13.40). Sorting them out is best done using $T = S'_i R'_c R'_d$ or $T = S'_i R'_c$ in place
of (13.43), and thinking about what happens when the phase shifts $\phi_c$ and $\phi_d$ are
allowed to vary. Exploring this is left to the reader.

When weakly-coupled detectors are present, what can we say about the particle
while it is inside the interferometer? Again assume, for simplicity, that $\phi_c$ and $\phi_d$
are zero. There are many possible frameworks, and we shall only consider one
example, a consistent family whose support consists of the five histories

$$ [\Omega_0] \odot [1c, 0\hat c, 0\hat d] \odot [2c, 0\hat c, 0\hat d] \odot [3c, 1\hat c, 0\hat d] \odot [4e, 1\hat c, 0\hat d], $$
$$ [\Omega_0] \odot [1c, 0\hat c, 0\hat d] \odot [2c, 0\hat c, 0\hat d] \odot [3c, 1\hat c, 0\hat d] \odot [4f, 1\hat c, 0\hat d], $$
$$ [\Omega_0] \odot [1d, 0\hat c, 0\hat d] \odot [2d, 0\hat c, 0\hat d] \odot [3d, 0\hat c, 1\hat d] \odot [4e, 0\hat c, 1\hat d], $$
$$ [\Omega_0] \odot [1d, 0\hat c, 0\hat d] \odot [2d, 0\hat c, 0\hat d] \odot [3d, 0\hat c, 1\hat d] \odot [4f, 0\hat c, 1\hat d], $$
$$ [\Omega_0] \odot [1\bar a, 0\hat c, 0\hat d] \odot [2\bar a, 0\hat c, 0\hat d] \odot [3\bar a, 0\hat c, 0\hat d] \odot [4f, 0\hat c, 0\hat d]. \tag{13.46} $$

(Consistency follows from the orthogonality of the final projectors, Sec. 11.3.)
Using this family one can conclude that if at t = 4 the c detector has been triggered,
the particle was earlier (t = 1, 2, or 3) in the c arm; if the d detector has been
triggered, the particle was earlier in the d arm; and if neither detector has been
triggered, the particle was earlier in a superposition state $|\bar a\rangle$. The corresponding
statements for Feynman’s double slit with a weak light source would be that if a
photon scatters off an electron which has just passed through the slit system, then
the electron previously passed through the slit indicated by the scattered photon,
whereas if no photon scatters off the electron, it passed through the slit system in a
coherent superposition.

While these results are not unreasonable, there is nonetheless something a bit
odd going on. The projector $[1\bar a, 0\hat c, 0\hat d]$ at time t = 1 in the last history in (13.46)
does not commute with the projectors at t = 1 in the other histories, even though
the projectors for the histories themselves (on the history space $\breve{\mathcal{H}}$) do commute
with each other, since their products are 0. This means that the Boolean algebra
associated with (13.46) does not contain the projector $[1\bar a]_1$ for the particle to be
in a coherent superposition state at the time t = 1, nor does it contain $[1c]_1$ or
$[1d]_1$. Thus the events at t = 1, and also at t = 2 and t = 3, in these histories
are dependent or contextual in the sense employed in Sec. 6.6 when discussing
(6.55). Within the framework represented by (13.46), they only make sense when
discussed together with certain later events; they depend on the later outcomes of
the weak measurements in a sense which will be discussed in Ch. 14.



14 

Dependent (contextual) events 


14.1 An example 

Consider two spin-half particles a and b, and suppose that the corresponding
Boolean algebra $\mathcal{C}$ of properties on the tensor product space $\mathcal{A} \otimes \mathcal{B}$ is generated by
a sample space of four projectors,

$$ [z_a^+] \otimes [z_b^+], \quad [z_a^+] \otimes [z_b^-], \quad [z_a^-] \otimes [x_b^+], \quad [z_a^-] \otimes [x_b^-], \tag{14.1} $$

which sum to the identity operator $I \otimes I$. Let $A = [z_a^+]$ be the property that
$S_{az} = +1/2$ for particle a, and its negation $\tilde A = I - A = [z_a^-]$ the property that
$S_{az} = -1/2$. Likewise, let $B = [z_b^+]$ and $\tilde B = I - B = [z_b^-]$ be the properties
$S_{bz} = +1/2$ and $S_{bz} = -1/2$ for particle b. Together with the projectors $AB$ and
$A\tilde B$, the first two items in (14.1), the Boolean algebra $\mathcal{C}$ also contains their sum

$$ A = AB + A\tilde B \tag{14.2} $$

and its negation $\tilde A$. On the other hand $\mathcal{C}$ does not contain the projector $B$ or its
negation $\tilde B$, as is obvious from the fact that these operators do not commute with
the last two projectors in (14.1). Thus when using the framework $\mathcal{C}$ one can discuss
whether $S_{az}$ is +1/2 or −1/2 without making any reference to the spin of particle
b. But it only makes sense to discuss whether $S_{bz}$ is +1/2 or −1/2 when one
knows that $S_{az} = +1/2$. That is, one cannot ascribe a value to $S_{bz}$ in an absolute
sense without making any reference to the spin of particle a.

If it makes sense to talk about a property B when a system possesses the property 
A but not otherwise, we shall say that B is a contextual property: it is meaningful 
only within a certain context. Also we shall say that B depends on A, and that A is 
the base of B. (One might also call A the support of B.) A slightly more restric- 
tive definition is given in Sec. 14.3, and generalized to contextual events which do 
not have a base. It is important to notice that contextuality and the corresponding 
dependence is very much a function of the Boolean algebra $\mathcal L$ employed for constructing
a quantum description. For example, the Boolean algebra $\mathcal L'$ generated
by

$$ [z_a^+] \otimes [z_b^+], \quad [z_a^+] \otimes [z_b^-], \quad [z_a^-] \otimes [z_b^+], \quad [z_a^-] \otimes [z_b^-] \tag{14.3} $$

contains both $A = [z_a^+]$ and $B = [z_b^+]$, and thus in this algebra B does not depend
upon A. And in the algebra $\mathcal L''$ generated by

$$ [z_a^+] \otimes [z_b^+], \quad [z_a^-] \otimes [z_b^+], \quad [x_a^+] \otimes [z_b^-], \quad [x_a^-] \otimes [z_b^-], \tag{14.4} $$

the property A is contextual and depends on B.

Since quantum theory does not prescribe a single “correct” Boolean algebra of 
properties to use in describing a quantum system, whether or not some property 
is contextual or dependent on another property is a consequence of the physicist’s 
choice to describe a quantum system in a particular way and not in some other way. 
In particular, when B depends on A in the sense we are discussing, one should not 
think of B as being caused by A, as if the two properties were linked by a physical 
cause. The dependence is logical, not physical, and has to do with what other 
properties are or are not allowed as part of the description based upon a particular 
Boolean algebra. 
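
The membership claims of this section can be checked numerically. The following is a minimal sketch in plain Python, added for illustration and not part of the original text; the helper names (`matmul`, `kron`, `proj`, `compatible`) are assumptions introduced here. It relies on the fact that, when a sample space consists of projectors onto the kets of an orthonormal basis, a projector belongs to the generated Boolean algebra exactly when it commutes with every element of the sample space.

```python
from math import sqrt

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    """Tensor (Kronecker) product of two matrices."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def proj(ket):
    """Projector |ket><ket| for a normalized real ket."""
    return [[u * v for v in ket] for u in ket]

def commute(a, b, tol=1e-12):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return all(abs(ab[i][j] - ba[i][j]) < tol
               for i in range(n) for j in range(n))

def compatible(p, sample_space):
    """True if p commutes with every projector of the sample space."""
    return all(commute(p, q) for q in sample_space)

s = 1 / sqrt(2)
zp, zm = [1.0, 0.0], [0.0, 1.0]      # S_z = +1/2 and -1/2
xp, xm = [s, s], [s, -s]             # S_x = +1/2 and -1/2
identity = [[1.0, 0.0], [0.0, 1.0]]

# Sample space (14.1): particle b is described in the x basis when S_az = -1/2.
space_14_1 = [kron(proj(zp), proj(zp)), kron(proj(zp), proj(zm)),
              kron(proj(zm), proj(xp)), kron(proj(zm), proj(xm))]
# Sample space (14.3): both particles are described in the z basis.
space_14_3 = [kron(proj(p), proj(q)) for p in (zp, zm) for q in (zp, zm)]

A = kron(proj(zp), identity)   # S_az = +1/2, nothing said about b
B = kron(identity, proj(zp))   # S_bz = +1/2, nothing said about a

a_in_14_1 = compatible(A, space_14_1)
b_in_14_1 = compatible(B, space_14_1)
b_in_14_3 = compatible(B, space_14_3)
print(a_in_14_1, b_in_14_1, b_in_14_3)   # True False True
```

The output reflects the discussion above: A belongs to the algebra generated by (14.1), B does not, and B is restored once the sample space (14.3) is used instead.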


14.2 Classical analogy 

It is possible to construct an analogy for quantum contextual properties based on 
purely classical ideas. The analogy is somewhat artificial, but even its artificial 
character will help us understand better why dependency is to be expected in quan- 
tum theory, when it normally does not show up in classical physics. Let x and 
y be real numbers which can take on any values between 0 and 1, so that pairs 
(x, y) are points in the unit square, Fig. 14.1. In classical statistical mechanics one 
sometimes divides up the phase space into nonoverlapping cells (Sec. 5.1), and in 
a similar way we shall divide up the unit square into cells of finite area, and regard 
each cell as an element of the sample space of a probabilistic theory. The sample 
space corresponding to the cells in Fig. 14.1(a) consists of four mutually-exclusive 
properties: 

$$ \begin{gathered} \{0 < x < 1/2,\ 0 < y < 1/2\}, \quad \{0 < x < 1/2,\ 1/2 < y < 1\}, \\ \{1/2 < x < 1,\ 0 < y < 1/2\}, \quad \{1/2 < x < 1,\ 1/2 < y < 1\}. \end{gathered} \tag{14.5} $$

Let A be the property 0 < x < 1/2, so its complement $\tilde A$ is 1/2 < x < 1, and let
B be the property 0 < y < 1/2, so $\tilde B$ is 1/2 < y < 1. Then the four sets in (14.5)
correspond to the properties $A \land B$, $A \land \tilde B$, $\tilde A \land B$, $\tilde A \land \tilde B$. It is then obvious that
the Boolean algebra of properties generated by (14.5) contains both A and B, so 
(14.5) is analogous in this respect to the quantum sample space (14.3). 

Fig. 14.1. Unit square in the x, y plane: (a) shows the set of cells in (14.5), (b) the set
of cells in (14.6), and (c) the cells in a common refinement (see text). Property A is
represented by the vertical rectangular cell on the left, and B by the horizontal rectangular
cell (not present in (b)) on the bottom. The gray region represents $A \land B$.


An alternative choice for cells is shown in Fig. 14.1(b), where the four mutually- 
exclusive properties are 


$$ \begin{gathered} \{0 < x < 1/2,\ 0 < y < 1/2\}, \quad \{0 < x < 1/2,\ 1/2 < y < 1\}, \\ \{1/2 < x < 1,\ 0 < y < 2/3\}, \quad \{1/2 < x < 1,\ 2/3 < y < 1\}. \end{gathered} \tag{14.6} $$


If A and B are defined in the same way as before, the new algebra of properties
generated by (14.6) contains A and $A \land B$, but does not contain B. In this respect
it is analogous to (14.1) in the quantum case, and B is a contextual or dependent
property: it only makes sense to ask whether the system has or does not have the
property B when the property A is true, that is, when x is between 0 and 1/2, but
the same question does not make sense when x is between 1/2 and 1, that is, when
A is false.

Isn’t this just some sort of formal nitpicking? Why not simply refine the sample 
space of Fig. 14.1(b) by using the larger collection of cells shown in Fig. 14.1(c)? 
The corresponding Boolean algebra of properties includes all those in (14.6), so 
we have not lost the ability to describe whatever we would like to describe, and 
now B as well as A is part of the algebra of properties, so dependency is no longer 
of any concern. Such a refinement of the sample space can always be employed in 
classical statistical mechanics. However, a similar type of refinement may or may 
not be possible in quantum mechanics. There is no way to refine the sample space 
in (14.1), for the four projectors in that list already project onto one-dimensional 
subspaces, which is as far as a quantum refinement can go. The move from (b) to 
(c) in Fig. 14.1, which conveniently gets rid of contextual properties in a classical 
context, will not work in the case of (14.1); the latter is an example of an irreducible 
contextuality. 
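
The classical side of this comparison is easy to make concrete. The sketch below is an illustration added here, not part of the original text. It represents each cell as a rectangle with exact rational bounds and tests whether a property is a union of cells, which is what membership in the classical algebra amounts to. The half-open intervals and the function names are assumptions made for the example, and the refinement it builds is the coarsest common refinement of (14.5) and (14.6), which may differ in detail from the cells drawn in Fig. 14.1(c).

```python
from fractions import Fraction

# A rectangle (x0, x1, y0, y1) stands for the cell x0 <= x < x1, y0 <= y < y1.
# Half-open intervals are an assumption made here so that the cells tile the square.

def inside(cell, region):
    return (region[0] <= cell[0] and cell[1] <= region[1]
            and region[2] <= cell[2] and cell[3] <= region[3])

def disjoint(cell, region):
    return (cell[1] <= region[0] or region[1] <= cell[0]
            or cell[3] <= region[2] or region[3] <= cell[2])

def in_algebra(region, cells):
    """True if the region is a union of cells of the sample space."""
    return all(inside(c, region) or disjoint(c, region) for c in cells)

def common_refinement(cells_1, cells_2):
    """All nonempty overlaps of a cell from one partition with a cell from the other."""
    refined = []
    for c in cells_1:
        for d in cells_2:
            r = (max(c[0], d[0]), min(c[1], d[1]),
                 max(c[2], d[2]), min(c[3], d[3]))
            if r[0] < r[1] and r[2] < r[3]:
                refined.append(r)
    return refined

half, two_thirds = Fraction(1, 2), Fraction(2, 3)
cells_a = [(0, half, 0, half), (0, half, half, 1),
           (half, 1, 0, half), (half, 1, half, 1)]               # (14.5)
cells_b = [(0, half, 0, half), (0, half, half, 1),
           (half, 1, 0, two_thirds), (half, 1, two_thirds, 1)]   # (14.6)
cells_c = common_refinement(cells_a, cells_b)

A = (0, half, 0, 1)            # 0 <= x < 1/2
B = (0, 1, 0, half)            # 0 <= y < 1/2
A_and_B = (0, half, 0, half)

print(in_algebra(B, cells_a))        # True
print(in_algebra(A, cells_b))        # True
print(in_algebra(A_and_B, cells_b))  # True
print(in_algebra(B, cells_b))        # False: B is contextual in (14.6)
print(in_algebra(B, cells_c))        # True: refinement removes the dependence
```

No analogous `common_refinement` exists for (14.1): the overlaps would have to be products of noncommuting projectors, which are not projectors.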

To be more specific, the refinement in Fig. 14.1(c) is obtained by forming the
products of the indicators for B, $\tilde B$, $B'$, and $\tilde B'$ with one another and with A and
$\tilde A$, where $B'$ is the property 0 < y < 2/3. The analogous process for (14.1) would
require taking products of projectors such as $[z_b^+]$ and $[x_b^+]$, but since they do not
commute with each other, their product is not a projector. That noncommutativity
of the projectors is at the heart of the contextuality associated with (14.1) can also
be seen by considering two classical spinning objects a and b with angular momenta
$\mathbf L_a$ and $\mathbf L_b$, and interpreting $[z_a^+]$ and $[z_a^-]$ in (14.1) as $L_{az} > 0$ and $L_{az} < 0$,
etc. In the classical case there is no difficulty refining the sample space of (14.1)
to get rid of dependency, for $[z_b^+][x_b^+]$ is the property $L_{bz} > 0 \land L_{bx} > 0$, which
makes perfectly good (classical) sense. But its quantum counterpart for a spin-half
particle has no physical meaning.


14.3 Contextual properties and conditional probabilities 

If A and B are elements of a Boolean algebra $\mathcal L$ for which a probability distribution
is defined, then

$$ \Pr(B \mid A) = \Pr(AB)/\Pr(A) \tag{14.7} $$

is defined provided Pr(A) is greater than 0. If, however, B is not an element of $\mathcal L$,
then Pr(B) is not defined and, as a consequence, Pr(A | B) is also not defined. In
view of these remarks it makes sense to define B as a contextual property which
depends upon A, with A the base of B, provided Pr(B | A) is positive (which implies
Pr(AB) > 0), whereas Pr(B) is undefined. This definition is stricter than the one in
Sec. 14.1, but the cases it eliminates, those with Pr(B | A) = 0, are in practice
rather uninteresting. In addition, one is usually interested in situations where the 
dependence is irreducible, that is, it cannot be eliminated by appropriately refining 
the sample space, unlike the classical example in Sec. 14.2. 

One can extend this definition to events which depend on other contextual events. 
For example, let A, B, and C be commuting projectors, and suppose A, AB, and 
ABC belong to the Boolean algebra, but B and C do not. Then as long as 

$$ \Pr(C \mid AB) = \Pr(ABC)/\Pr(AB) \tag{14.8} $$

is positive, we shall say that C depends on B (or on AB), and B depends on A.
Note that if (14.8) is positive, so is Pr(AB), and thus Pr(B | A), (14.7), is also
positive.

There are situations in which the properties A and B, represented by commuting
projectors, are contextual even though neither can be said to depend upon or be the
base of the other. That is, AB belongs to the Boolean algebra $\mathcal L$ and has positive
probability, but neither A nor B belongs to $\mathcal L$. In this case neither Pr(A | B) nor
Pr(B | A) is defined, so one cannot say that B depends on A or A on B, though one
might refer to them as “codependent”. As an example, let $\mathcal A$ and $\mathcal B$ be two Hilbert
spaces of dimension 2 and 3, respectively, with orthonormal bases $\{|0a\rangle, |1a\rangle\}$ and
$\{|0b\rangle, |1b\rangle, |2b\rangle\}$. In addition, define

$$ |{+}b\rangle = \bigl(|0b\rangle + |1b\rangle\bigr)/\sqrt2, \quad |{-}b\rangle = \bigl(|0b\rangle - |1b\rangle\bigr)/\sqrt2, \tag{14.9} $$

and $|{+}a\rangle$ and $|{-}a\rangle$ in a similar way. Then the six kets

$$ \begin{gathered} |0a\rangle \otimes |0b\rangle, \quad |1a\rangle \otimes |{+}b\rangle, \quad |{+}a\rangle \otimes |2b\rangle, \\ |0a\rangle \otimes |1b\rangle, \quad |1a\rangle \otimes |{-}b\rangle, \quad |{-}a\rangle \otimes |2b\rangle, \end{gathered} \tag{14.10} $$

form an orthonormal basis for $\mathcal A \otimes \mathcal B$, and the corresponding projectors generate
a Boolean algebra $\mathcal L$. If $A = [0a] \otimes I$ and $B = I \otimes [0b]$, then $\mathcal L$ contains AB,
corresponding to the first ket in (14.10), but neither A nor B belongs to $\mathcal L$, since
[0a] does not commute with [+a], and [0b] does not commute with [+b]. More
complicated cases of “codependency” are also possible, as when $\mathcal L$ contains the
product ABC of three commuting projectors, but none of the six projectors A, B,
C, AB, BC, and AC belong to $\mathcal L$.
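
The “codependent” example can be verified directly. The following plain-Python sketch is an illustration added here, not part of the original text, and its helper names are assumptions. It builds the six kets of (14.10), confirms that they are orthonormal, and checks which of A, B, and AB commute with all six projectors. For a sample space of rank-one projectors onto an orthonormal basis, that commutation test is equivalent to membership in the generated algebra.

```python
from math import sqrt

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    """Tensor product of two matrices."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def tensor(u, v):
    """Tensor product of two kets."""
    return [x * y for x in u for y in v]

def proj(ket):
    """Projector |ket><ket| for a normalized real ket."""
    return [[u * v for v in ket] for u in ket]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def commute(a, b, tol=1e-12):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return all(abs(ab[i][j] - ba[i][j]) < tol
               for i in range(n) for j in range(n))

def compatible(p, sample_space):
    return all(commute(p, q) for q in sample_space)

s = 1 / sqrt(2)
a0, a1 = [1.0, 0.0], [0.0, 1.0]
a_plus, a_minus = [s, s], [s, -s]
b0, b1, b2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
b_plus, b_minus = [s, s, 0.0], [s, -s, 0.0]      # (14.9)

kets = [tensor(a0, b0), tensor(a1, b_plus), tensor(a_plus, b2),
        tensor(a0, b1), tensor(a1, b_minus), tensor(a_minus, b2)]   # (14.10)

def inner(u, v):
    return sum(x * y for x, y in zip(u, v))      # real kets only

orthonormal = all(abs(inner(u, v) - (1.0 if i == j else 0.0)) < 1e-12
                  for i, u in enumerate(kets) for j, v in enumerate(kets))

sample_space = [proj(k) for k in kets]
A = kron(proj(a0), identity(3))
B = kron(identity(2), proj(b0))
AB = matmul(A, B)

a_in = compatible(A, sample_space)
b_in = compatible(B, sample_space)
ab_in = compatible(AB, sample_space)
print(orthonormal, a_in, b_in, ab_in)   # True False False True
```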


14.4 Dependent events in histories

In precisely the same way that quantum properties can be dependent upon other 
quantum properties of a system at a single time, a quantum event — a property of a 
quantum system at a particular time — can be dependent upon a quantum event at 
some different time. That is, in the family of consistent histories used to describe 
the time development of a quantum system, it may be the case that the projector 
B for an event at a particular time does not occur by itself in the Boolean algebra
$\mathcal L$ of histories, but is only present if some other event A at some different time is
present in the same history. Then B depends on A, or A is the base of B, using 
the terminology introduced earlier. And there are situations in which a third event 
C at still another time depends on B, so that it only makes sense to discuss C as 
part of a history in which both A and B occur. Sometimes this contextuality can be 
removed by refining the history sample space, but in other cases it is irreducible, 
either because a refinement is prevented by noncommuting projectors, or because 
it would result in a violation of consistency conditions. 

Families of histories often contain contextual events that depend upon a base 
that occurs at an earlier time. Such a family is said to show “branch dependence”. 
A particular case is a family of histories with a single initial state $\Psi_0$. If one uses
the Boolean algebra suggested for that case in Sec. 11.5, then all the later events
in all the histories of interest are (ultimately) dependent upon the initial event $\Psi_0$.
This is because the only history in which the negation $\tilde\Psi_0 = I - \Psi_0$ of the initial
event occurs is the history Z in (11.14), and in that history only the identity occurs
at later times. It may or may not be possible to refine such a family in order to
remove some or all of the dependence upon $\Psi_0$.

Fig. 14.2. Upper and lower beams emerging from a Stern-Gerlach magnet SG. An atom
in the lower beam passes through an additional region of uniform magnetic field M. The
square boxes indicate regions in space, and the time when the atom will pass through a
given region is indicated at the bottom of the figure.

An example of branch dependence involving something other than the initial 
state is shown in Fig. 14.2. A spin-half particle passes through a Stern-Gerlach 
magnet (Sec. 17.2) and emerges moving at an upwards angle if $S_z = +1/2$, or
a downwards angle if $S_z = -1/2$. Let E and F be projectors on two regions in
space which include the upward- and downward-moving wave packets at time $t_1$,
assuming a state $|\Psi_0\rangle$ (space-and-spin wave function of the particle) at time $t_0$. In
the interval between $t_1$ and $t_2$ the downward-moving wave packet passes through
a region M of uniform magnetic field which causes the spin state to rotate by 90°
from $S_z = -1/2$ to $S_x = +1/2$. This situation can be described using a consistent
family whose support is the two histories


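$$ \Psi_0 \odot E \odot [z^+], \qquad \Psi_0 \odot F \odot [x^+], \tag{14.11} $$

which can also be written in the form

$$ \Psi_0 \odot \begin{cases} E \odot [z^+], \\ F \odot [x^+], \end{cases} \tag{14.12} $$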


where the initial element common to both histories is indicated only once. Consistency
follows from the fact that the spatial wave functions at the final time $t_2$
have negligible overlap, even though they are not explicitly referred to in (14.12).
Whatever may be the zero-weight histories, it is at once evident that neither of the
two histories

$$ \Psi_0 \odot I \odot [z^+], \qquad \Psi_0 \odot I \odot [x^+] \tag{14.13} $$

can occur in the Boolean algebra, since the projector for the first history in (14.13)
does not commute with that for the second history in (14.12), and the second history
in (14.13) is incompatible with the first history in (14.12). Consequently, in
the consistent family (14.12) $[z^+]$ at $t_2$ depends upon E at $t_1$, and $[x^+]$ at $t_2$ depends
upon F at $t_1$. Furthermore, as the necessity for this dependency can be traced to
noncommuting projectors, the dependency is irreducible: one cannot get rid of it
by refining the consistent family.

An alternative way of thinking about the same gedanken experiment is to note
that at $t_2$ the wave packets do not overlap, so we can find mutually orthogonal
projectors $\bar E$ and $\bar F$ on nonoverlapping regions of space, Fig. 14.2, which include
the upward- and downward-moving parts of the wave packet at this time. Consider
the consistent family whose support is the two histories

$$ \Psi_0 \odot I \odot \{[z^+]\bar E,\ [x^+]\bar F\}, \tag{14.14} $$

where the notation is a variant of that in (14.12): the two events inside the curly
brackets are both at the time $t_2$, so one history ends with the projector $[z^+]\bar E$, the
other with the projector $[x^+]\bar F$. Once again, the final spin states $[z^+]$ and $[x^+]$ are
dependent events, but now $[z^+]$ depends upon $\bar E$ and $[x^+]$ upon $\bar F$, so the bases
occur at the same time as the contextual events which depend on them. This is
a situation which resembles (14.1), with $\bar E$ and $\bar F$ playing the roles of $[z_a^+]$ and
$[z_a^-]$, respectively, while the spin projectors in (14.14) correspond to those of the b
particle in (14.1). One could also move the regions $\bar E$ and $\bar F$ further to the right in
Fig. 14.2, and obtain a family of histories

$$ \Psi_0 \odot I \odot \begin{cases} [z^+] \odot \bar E, \\ [x^+] \odot \bar F, \end{cases} \tag{14.15} $$

for the times $t_0 < t_1 < t_2 < t_3$, in which $[z^+]$ and $[x^+]$ are dependent on the later
events $\bar E$ and $\bar F$.

Dependence on later events also arises, for certain families of histories, in the 
next example we shall consider, which is a variant of the toy model discussed 
in Sec. 13.5. Figure 14.3 shows a device which is like a Mach-Zehnder interferometer,
but the second beam splitter has been replaced by a weakly-coupled measuring
device M, with initial (“ready”) state $|M\rangle$. The relevant unitary transformations
are

$$ |\Psi_0\rangle = |0a\rangle \otimes |M\rangle \mapsto \bigl(|1c\rangle + |1d\rangle\bigr)/\sqrt2 \otimes |M\rangle \tag{14.16} $$

for the time interval $t_0$ to $t_1$, and

$$ \begin{aligned} |1c\rangle \otimes |M\rangle &\mapsto |2f\rangle \otimes \bigl(|M\rangle + |M^c\rangle\bigr)/\sqrt2, \\ |1d\rangle \otimes |M\rangle &\mapsto |2e\rangle \otimes \bigl(|M\rangle + |M^d\rangle\bigr)/\sqrt2 \end{aligned} \tag{14.17} $$

for $t_1$ to $t_2$. Here $|0a\rangle$ is a wave packet approaching the beam splitter in channel a at
$t_0$, $|1c\rangle$ is a wave packet in the c arm at time $t_1$, and so forth. The time $t_1$ is chosen
so that the particle is inside the device, somewhere between the initial beam splitter
and the detector M, whereas at $t_2$ it has emerged in e or f. The states $|M\rangle$, $|M^c\rangle$,
and $|M^d\rangle$ of the detector are mutually orthogonal and normalized. Combining
(14.16) and (14.17) yields a unitary time development

$$ |\Psi_0\rangle \mapsto \bigl(|2e\rangle \otimes |M^d\rangle + |2f\rangle \otimes |M^c\rangle + \sqrt2\,|2s\rangle \otimes |M\rangle\bigr)/2 \tag{14.18} $$

from $t_0$ to $t_2$, where

$$ |2s\rangle = \bigl(|2e\rangle + |2f\rangle\bigr)/\sqrt2 \tag{14.19} $$

is a superposition state of the final particle wave packets.
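
For readers who want the intermediate step, the combination of (14.16) and (14.17) can be written out explicitly. The following lines are a sketch added for illustration; they use only the linearity of the time development and the definition (14.19).

$$
\begin{aligned}
|\Psi_0\rangle &\mapsto \tfrac{1}{\sqrt2}\bigl(|1c\rangle + |1d\rangle\bigr) \otimes |M\rangle \\
&\mapsto \tfrac12\bigl[\,|2f\rangle \otimes \bigl(|M\rangle + |M^c\rangle\bigr) + |2e\rangle \otimes \bigl(|M\rangle + |M^d\rangle\bigr)\bigr] \\
&= \tfrac12\bigl[\,|2e\rangle \otimes |M^d\rangle + |2f\rangle \otimes |M^c\rangle + \bigl(|2e\rangle + |2f\rangle\bigr) \otimes |M\rangle\bigr] \\
&= \tfrac12\bigl[\,|2e\rangle \otimes |M^d\rangle + |2f\rangle \otimes |M^c\rangle + \sqrt2\,|2s\rangle \otimes |M\rangle\bigr].
\end{aligned}
$$

The three terms are mutually orthogonal, with squared norms 1/4, 1/4, and 1/2, which sum to 1 as required for a unitary time development.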



Fig. 14.3. Mach-Zehnder interferometer with the second beam splitter replaced by a mea- 
suring device M. 

Consider the consistent family for $t_0 < t_1 < t_2$ whose support is the three
histories

$$ \Psi_0 \odot I \odot \{[2e] \otimes M^d,\ [2f] \otimes M^c,\ [2s] \otimes M\}. \tag{14.20} $$

Since the projector [2s] does not commute with the projectors [2e] and [2f], it is
clear that [2e], [2f], and [2s] are dependent upon the detector states $M^d$, $M^c$, and
M at the (same) time $t_2$, and one has conditional probabilities

$$ \Pr(2e \mid M^d_2) = \Pr(2f \mid M^c_2) = \Pr(2s \mid M_2) = 1. \tag{14.21} $$

On the other hand, $\Pr(M^d_2 \mid 2e)$, $\Pr(M^c_2 \mid 2f)$, and $\Pr(M_2 \mid 2s)$ are not defined. (Following
our usual practice, $\Psi_0$ is not shown explicitly as one of the conditions.) One
could also say that 2e and 2f are both dependent upon the state $M'$ with projector
$M^c + M^d$, corresponding to the fact that the detector has detected something.
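
The conditional probabilities in (14.21) can be reproduced with a few lines of arithmetic. The sketch below is an illustration added here, not part of the original text. It assumes a two-dimensional particle space at $t_2$ spanned by the e and f wave packets and a three-dimensional detector space, and it computes Born probabilities from the final state (14.18) alone, which suffices because the family (14.20) has only the identity at $t_1$. All names are hypothetical.

```python
from math import sqrt, isclose

def tensor(u, v):
    """Tensor product of two kets."""
    return [x * y for x in u for y in v]

s = 1 / sqrt(2)
# Particle basis at t2 in the order (2e, 2f); detector basis (M, M^c, M^d).
e, f = [1.0, 0.0], [0.0, 1.0]
superposition = [s, s]                       # |2s> of (14.19)
M, Mc, Md = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]

# Final state (14.18): (|2e>|M^d> + |2f>|M^c> + sqrt(2)|2s>|M>)/2.
terms = [tensor(e, Md), tensor(f, Mc),
         [sqrt(2) * x for x in tensor(superposition, M)]]
psi = [sum(t[i] for t in terms) / 2 for i in range(6)]

def joint(particle, detector):
    """Born probability of the rank-one projector [particle] (x) [detector]."""
    amplitude = sum(x * y for x, y in zip(tensor(particle, detector), psi))
    return amplitude ** 2

def detector_prob(detector):
    """Probability of a detector state, summed over the particle basis."""
    return joint(e, detector) + joint(f, detector)

def conditional(particle, detector):
    """Pr(particle state | detector state) within the family (14.20)."""
    return joint(particle, detector) / detector_prob(detector)

norm = sum(x * x for x in psi)
weights = [detector_prob(Md), detector_prob(Mc), detector_prob(M)]
conds = [conditional(e, Md), conditional(f, Mc), conditional(superposition, M)]
print(weights)   # approximately [0.25, 0.25, 0.5]
print(conds)     # approximately [1.0, 1.0, 1.0]
```

The code can just as easily evaluate `joint(e, M)`, but that number has no meaning inside the family (14.20), whose sample space does not contain $[2e] \otimes M$. The arithmetic does not enforce the single-framework rule; the choice of family does.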

Some understanding of the physical significance of this dependency can be obtained
by supposing that later experiments are carried out to confirm (14.21). One
can check that the particle emerging from M is in the e channel if the detector state
is $M^d$, or in f if the detector is in $M^c$, by placing detectors in the e and f channels.
One could also verify that the particle emerges in the superposition state s in a
case in which it is not detected (the detector is still in state M at $t_2$) by the strategy
of adding two more mirrors to bring the e and f channels back together again at
a beam splitter which is followed by detectors. Of course, this last measurement
cannot be carried out if there are already detectors in the e and f channels, reflecting
the fact that the property 2s is incompatible with 2e and 2f. (A similar pair of
incompatible measurements is discussed in Sec. 18.4, see Fig. 18.3.)

An alternative consistent family for $t_0 < t_1 < t_2$ has support

$$ \Psi_0 \odot \begin{cases} [1c] \odot M^c, \\ [1d] \odot M^d, \\ [1r] \odot M, \end{cases} \tag{14.22} $$

where

$$ |1r\rangle = \bigl(|1c\rangle + |1d\rangle\bigr)/\sqrt2 \tag{14.23} $$

is a superposition state of the particle before it reaches M. From the fact that [1r]
does not commute with [1c] or [1d], it is obvious that the particle states at the
intermediate time $t_1$ in (14.22) must depend upon the later detector states: [1c]
upon $M^c$, [1d] upon $M^d$, and [1r] upon M. Indeed,

$$ \Pr(1c \mid M^c_2) = \Pr(1d \mid M^d_2) = \Pr(1r \mid M_2) = 1, \tag{14.24} $$

whereas $\Pr(M^c_2 \mid 1c)$, $\Pr(M^d_2 \mid 1d)$, and $\Pr(M_2 \mid 1r)$, the conditional probabilities
with their arguments in reverse order, are not defined. A very similar dependence
upon later events occurs in the family (13.46) associated with weak measurements
in the arms of a Mach-Zehnder interferometer, Sec. 13.5.

It may seem odd that earlier contextual events can depend on later events. Does 
this mean that the future is somehow influencing the past? As already noted in 
Sec. 14.1, it is important not to confuse the term depends on, used to character- 
ize the logical relationship among events in a consistent family, with a notion of 
physical influence or causality. The following analogy may be helpful. Think 
of a historian writing a history of the French revolution. He will not limit him- 
self to the events of the revolution itself, but will try and show that these events 
were preceded by others which, while their significance may not have been evident 
at the time, can in retrospect be seen as useful for understanding what happened 
later. In selecting the type of prior events which enter his account, the historian 
will use his knowledge of what happened later. It is not a question of later events 
somehow “causing” the earlier events, at least as causality is ordinarily understood. 
Instead, those earlier events are introduced into the account which are useful for
understanding the later events. While classical histories cannot provide a perfect
analogy with quantum histories, this example may help in understanding how the 
earlier particle states in (14.22) can be said to “depend on” the later states of M 
without being “caused by” them. 

To be sure, one often encounters quantum descriptions in which earlier events, 
such as the initial state, are the bases of later dependent events, and it is rather 
natural in such cases to think of (at least some of) the later events as actually caused 
by the earlier events. This may be why later contextual events that depend on earlier 
events somehow seem more intuitively reasonable than the reverse. Nonetheless, 
the ideas of causation and contextuality are quite distinct, and confusing the two 
can lead to paradoxes. 


15.1 Introduction

Density matrices are employed in quantum mechanics to give a partial descrip- 
tion of a quantum system, one from which certain details have been omitted. For 
example, in the case of a composite quantum system consisting of two or more 
subsystems, one may find it useful to construct a quantum description of just one 
of these subsystems, either at a single time or as a function of time, while ignor- 
ing the other subsystem(s). Or it may be the case that the exact initial state of a 
quantum system is not known, and one wants to use a probability distribution or 
pre-probability as an initial state. 

Probability distributions are used in classical statistical mechanics in order to 
construct partial descriptions, and density matrices play a somewhat similar role in 
quantum statistical mechanics, a subject which lies outside the scope of this book. 
In this chapter we shall mention a few of the ways in which density matrices are 
used in quantum theory, and discuss their physical significance. 

Positive operators and density matrices were defined in Sec. 3.9. To recapitulate,
a positive operator is a Hermitian operator whose eigenvalues are nonnegative, and
a density matrix $\rho$ is a positive operator whose trace (the sum of its eigenvalues) is
1. If R is a positive operator but not the zero operator, its trace is greater than 0,
and one can define a corresponding density matrix by means of the formula

$$ \rho = R/\operatorname{Tr}(R). \tag{15.1} $$

The eigenvalues of a density matrix $\rho$ must lie between 0 and 1. If one of the eigenvalues
is 1, the rest must be 0, and $\rho = \rho^2$ is a projector onto a one-dimensional
subspace of the Hilbert space. Such a density matrix is called a pure state. Otherwise
there must be at least two nonzero eigenvalues, and the density matrix is called
a mixed state.
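
A small numerical illustration of (15.1) and of the pure/mixed distinction follows. It is a sketch added here, not part of the original text. The function names are hypothetical, the example operators are chosen only for convenience, and the code does not itself verify that R is positive. It uses the standard fact that $\operatorname{Tr}(\rho^2) = 1$ exactly when $\rho = \rho^2$, that is, for a pure state, and is smaller than 1 for a mixed state.

```python
from math import isclose

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def density_matrix(r):
    """Normalize a positive, nonzero operator R as in (15.1).

    Positivity of R is assumed, not checked; only Tr(R) > 0 is tested.
    """
    t = trace(r)
    if t <= 0:
        raise ValueError("a positive nonzero operator must have Tr(R) > 0")
    return [[x / t for x in row] for row in r]

def purity(rho):
    """Tr(rho^2): 1 for a pure state, less than 1 for a mixed state."""
    return trace(matmul(rho, rho))

# R = 3 [z+] + [z-]: eigenvalues 3/4 and 1/4 after normalization, a mixed state.
mixed = density_matrix([[3.0, 0.0], [0.0, 1.0]])
# R = 2 [x+]: a multiple of a one-dimensional projector, a pure state.
pure = density_matrix([[1.0, 1.0], [1.0, 1.0]])

print(trace(mixed), purity(mixed))   # 1.0 0.625
print(trace(pure), purity(pure))     # 1.0 1.0
```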

Density matrices very often function as pre-probabilities which can be used to
generate probability distributions in different bases, and averages of different observables.
This is discussed in Sec. 15.2. Density matrices arise rather naturally
when one is trying to describe a subsystem $\mathcal A$ of a larger system $\mathcal A \otimes \mathcal B$, and
Secs. 15.3-15.5 are devoted to this topic. The use of a density matrix to describe
an isolated system is considered in Sec. 15.6. Section 15.7 on conditional density
matrices discusses a more advanced topic related to correlations between subsystems.


15.2 Density matrix as a pre-probability 

Recall that in some circumstances a quantum wave function or ket $|\psi\rangle$ need not
denote an actual physical property $[\psi]$ of the quantum system; instead it can serve
as a pre-probability, a mathematical device which allows one to calculate various 
probabilities. See the discussion in Sec. 9.4, and various examples in Sec. 12.1 and 
Ch. 13. In most cases (see the latter part of Sec. 15.6 for one of the exceptions) a 
density matrix is best thought of as a pre-probability. Thus while it provides useful 
information about a quantum system, one should not think of it as corresponding 
to an actual physical property; it does not represent “quantum reality”. For this 
reason, referring to a density matrix as the “state” of a quantum system can be 
misleading. However, in classical statistical mechanics it is customary to refer 
to probability distributions as “states”, even though a probability distribution is 
obviously not a physical property, and hence it is not unreasonable to use the same 
term for a density matrix functioning as a quantum pre-probability. 

A density matrix which is a pre-probability can be used to generate a probability
distribution in the following way. Given a sample space corresponding to a
decomposition of the identity

$$ I = \sum_j P^j \tag{15.2} $$

into orthogonal projectors, the probability of the property $P^j$ is

$$ p_j = \operatorname{Tr}(P^j \rho P^j) = \operatorname{Tr}(\rho P^j), \tag{15.3} $$

where the traces are equal because of cyclic permutation, Sec. 3.8. The operator
$P^j \rho P^j$ is positive (use the criterion (3.86)), and therefore its trace, the sum
of its eigenvalues, cannot be negative. Thus (15.3) defines a set of probabilities:
nonnegative real numbers whose sum, in view of (15.2), is equal to 1, the trace of
$\rho$. In particular, if for each j the projector $P^j = [j]$ is onto a state belonging to an
orthonormal basis $\{|j\rangle\}$, then

$$ p_j = \operatorname{Tr}(\rho\,|j\rangle\langle j|) = \langle j|\rho|j\rangle \tag{15.4} $$

is the jth diagonal element of $\rho$ in this basis. Hence the diagonal elements of $\rho$ in
an orthonormal basis form a probability distribution when this basis is used as the
quantum sample space. As a special case, the probabilities given by the Born rule,
Secs. 9.3 and 9.4, are of the form (15.4) when $\rho = |\psi_1\rangle\langle\psi_1|$ and $|j\rangle = |\phi_1^j\rangle$ in the
notation used in (9.35).

From (15.3) it is evident that the average $\langle V \rangle$, see (5.42), of an observable

$$ V = V^\dagger = \sum_j v_j P^j \tag{15.5} $$

can be written in a very compact form using the density matrix:

$$ \langle V \rangle = \sum_j p_j v_j = \operatorname{Tr}(\rho V). \tag{15.6} $$

If $\rho$ is a pure state $|\psi_1\rangle\langle\psi_1|$, then $\langle V \rangle$ is $\langle\psi_1|V|\psi_1\rangle$, as in (9.38). It is worth emphasizing
that while the trace in (15.6) can be carried out using any basis, interpreting
$\langle V \rangle$ as the average of a physical variable requires at least an implicit reference to
a basis (or decomposition of the identity) in which V is diagonal. Thus if two observables
V and W do not commute with each other, the two averages $\langle V \rangle$ and $\langle W \rangle$
cannot be thought of as pertaining to a single (stochastic) description of a quantum
system, for they necessarily involve incompatible quantum sample spaces, and thus
different probability distributions. The comments made about averages in Ch. 9
while discussing the Born rule, towards the end of Sec. 9.3 and in connection with
(9.38), also apply to averages calculated using density matrices.
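
The use of a density matrix as a pre-probability can be illustrated numerically. The sketch below is an example added here, not part of the original text. The particular matrix `rho` and all function names are assumptions chosen for the illustration, and spin components are measured in units of $\hbar$. It evaluates the probabilities (15.3)-(15.4) in two incompatible bases and the averages (15.6) of two noncommuting observables.

```python
from math import sqrt, isclose

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def proj(ket):
    """Projector |ket><ket| for a normalized real ket."""
    return [[u * v for v in ket] for u in ket]

def probabilities(rho, basis):
    """p_j = Tr(rho [j]) for each ket |j> of an orthonormal basis."""
    return [trace(matmul(rho, proj(ket))) for ket in basis]

def average(rho, v):
    """<V> = Tr(rho V), as in (15.6)."""
    return trace(matmul(rho, v))

s = 1 / sqrt(2)
rho = [[0.75, 0.0], [0.0, 0.25]]          # assumed example, written in the z basis
z_basis = [[1.0, 0.0], [0.0, 1.0]]
x_basis = [[s, s], [s, -s]]

p_z = probabilities(rho, z_basis)         # diagonal elements of rho in the z basis
p_x = probabilities(rho, x_basis)         # diagonal elements of rho in the x basis

S_z = [[0.5, 0.0], [0.0, -0.5]]
S_x = [[0.0, 0.5], [0.5, 0.0]]
avg_z = average(rho, S_z)
avg_x = average(rho, S_x)
# The same average obtained as sum_j p_j v_j in the basis where S_z is diagonal.
avg_z_from_probs = sum(p * v for p, v in zip(p_z, [0.5, -0.5]))

print(p_z, p_x)        # approximately [0.75, 0.25] and [0.5, 0.5]
print(avg_z, avg_x)    # approximately 0.25 and 0.0
```

Both probability lists sum to 1, but they belong to incompatible sample spaces and, as stressed above, cannot be combined into a single stochastic description.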


15.3 Reduced density matrix for subsystem 

Suppose we are interested in a composite system (Ch. 6) with a Hilbert space $\mathcal A \otimes \mathcal B$.
For example, $\mathcal A$ might be the Hilbert space of a particle, and $\mathcal B$ that of some system
(possibly another particle) with which it interacts. At $t_0$ let $|\Psi_0\rangle$ be a normalized
state of the combined system which evolves, by Schrödinger’s equation, to a state
$|\Psi_1\rangle$ at time $t_1$. Assume that we are interested in histories for two times, $t_0$ and $t_1$,
of the form $\Psi_0 \odot (A^j \otimes I)$, where $\Psi_0$ stands for the projector $[\Psi_0] = |\Psi_0\rangle\langle\Psi_0|$ and
the $A^j$ form a decomposition of the identity of the subsystem $\mathcal A$:

$$ I_{\mathcal A} = \sum_j A^j. \tag{15.7} $$

The probability that system $\mathcal A$ will have the property $A^j$ at $t_1$ can be calculated
using the generalization of the Born rule found in (10.34):

$$ \Pr(A^j) = \langle\Psi_1|A^j \otimes I|\Psi_1\rangle = \operatorname{Tr}[\Psi_1 (A^j \otimes I)]. \tag{15.8} $$

The trace on the right side of (15.8) can be carried out in two steps, see Sec. 6.5:
first a partial trace over $\mathcal B$ to yield an operator on $\mathcal A$, followed by a trace over $\mathcal A$. In
the first step the operator $A^j$, since it acts on $\mathcal A$ rather than $\mathcal B$, can be taken out of
the trace, so that

$$ \operatorname{Tr}_{\mathcal B}[\Psi_1 (A^j \otimes I)] = \rho A^j, \tag{15.9} $$

where

$$ \rho = \operatorname{Tr}_{\mathcal B}(\Psi_1) \tag{15.10} $$

is called the reduced density matrix, because it is used to describe the subsystem
$\mathcal A$ rather than the whole system $\mathcal A \otimes \mathcal B$. Since $\rho$ is the partial trace of a positive
operator, it is itself a positive operator: apply the test in (3.86). In addition, the
trace of $\rho$ is

$$ \operatorname{Tr}_{\mathcal A}(\rho) = \operatorname{Tr}(\Psi_1) = \langle\Psi_1|\Psi_1\rangle = 1, \tag{15.11} $$

so $\rho$ is a density matrix. Upon taking the trace of both sides of (15.9) over $\mathcal A$, one
obtains, see (15.8), the expression

$$ \Pr(A^j) = \operatorname{Tr}_{\mathcal A}(\rho A^j) \tag{15.12} $$

for the probability of the property $A^j$, in agreement with (15.3). Note that $|\Psi_1\rangle$,
the counterpart of $|\psi_1\rangle$ in the discussion of the Born rule in Sec. 9.4, functions as a
pre-probability, not as a physical property, and its partial trace $\rho$ also functions as a
pre-probability, which can be used to calculate probabilities for any sample space
of the form (15.7). In the same way one can define the reduced density matrix

$$ \rho' = \operatorname{Tr}_{\mathcal A}(\Psi_1) \tag{15.13} $$

for system $\mathcal B$ and use it to calculate probabilities of various properties of system $\mathcal B$.

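Let us consider a simple example. Let $\mathcal A$ and $\mathcal B$ be the spin spaces for two spin-half
particles a and b, and let

$$ |\Psi_1\rangle = \alpha\,|z_a^+\rangle \otimes |z_b^-\rangle + \beta\,|z_a^-\rangle \otimes |z_b^+\rangle, \tag{15.14} $$

where the subscripts identify the particles, and the coefficients satisfy

$$ |\alpha|^2 + |\beta|^2 = 1, \tag{15.15} $$

so that $|\Psi_1\rangle$ is normalized. The corresponding projector is

$$ \begin{aligned} \Psi_1 = |\Psi_1\rangle\langle\Psi_1| ={}& |\alpha|^2\,|z_a^+\rangle\langle z_a^+| \otimes |z_b^-\rangle\langle z_b^-| + |\beta|^2\,|z_a^-\rangle\langle z_a^-| \otimes |z_b^+\rangle\langle z_b^+| \\ &+ \alpha\beta^*\,|z_a^+\rangle\langle z_a^-| \otimes |z_b^-\rangle\langle z_b^+| + \alpha^*\beta\,|z_a^-\rangle\langle z_a^+| \otimes |z_b^+\rangle\langle z_b^-|. \end{aligned} \tag{15.16} $$

The partial trace in (15.10) is easily evaluated by noting that

$$ \operatorname{Tr}_{\mathcal B}\bigl(|z_b^-\rangle\langle z_b^+|\bigr) = \langle z_b^+|z_b^-\rangle = 0, \tag{15.17} $$

etc.; thus

$$ \rho = |\alpha|^2 [z_a^+] + |\beta|^2 [z_a^-]. \tag{15.18} $$

This is a positive operator, since its eigenvalues are $|\alpha|^2$ and $|\beta|^2$, and its trace is
equal to 1, (15.15). If both $\alpha$ and $\beta$ are nonzero, $\rho$ is a mixed state.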

Employing either (15.8) or (15.12), one can show that if the decomposition $[z_a^+]$,
$[z_a^-]$, the $S_{az}$ framework, is used as a sample space, the corresponding probabilities
are $|\alpha|^2$ and $|\beta|^2$, whereas if one uses $[x_a^+]$, $[x_a^-]$, the $S_{ax}$ framework, the probability
of each is 1/2. Of course it makes no sense to suppose that these two sets of
probabilities refer simultaneously to the same particle, as the two sample spaces
are incompatible. Using either the $S_{ax}$ or the $S_{az}$ framework precludes treating $\Psi_1$
at $t_1$ as a physical property when $\alpha$ and $\beta$ are both nonzero, since as a projector it
does not commute with $[w_a^+]$ for any direction w. Thus $\Psi_1$ and its partial trace $\rho$
should be thought of as pre-probabilities.
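
The two-spin example can be worked through numerically. The following sketch is an illustration added here, not part of the original text. The values of `alpha` and `beta`, the ordering of the basis, and the function names are assumptions made for the example. It forms the partial trace (15.10) of the state (15.14) and evaluates the probabilities in the $S_{az}$ and $S_{ax}$ frameworks.

```python
from math import sqrt, isclose

alpha, beta = 0.6, 0.8j        # assumed values with |alpha|^2 + |beta|^2 = 1
# Basis order for the pair (a, b): index 2*i + j, with 0 for z+ and 1 for z-.
psi = [0.0, alpha, beta, 0.0]  # alpha |z_a+ z_b-> + beta |z_a- z_b+>, as in (15.14)

def reduced_a(psi):
    """Partial trace over b of |psi><psi|: the 2x2 matrix rho of (15.10)."""
    return [[sum(psi[2 * i + j] * psi[2 * k + j].conjugate() for j in range(2))
             for k in range(2)] for i in range(2)]

def expectation(rho, ket):
    """<ket|rho|ket>, the probability (15.4) of the property [ket]."""
    total = sum(ket[i].conjugate() * rho[i][k] * ket[k]
                for i in range(2) for k in range(2))
    return total.real

rho = reduced_a(psi)
s = 1 / sqrt(2)
p_z = [expectation(rho, [1.0, 0.0]), expectation(rho, [0.0, 1.0])]   # S_az framework
p_x = [expectation(rho, [s, s]), expectation(rho, [s, -s])]          # S_ax framework
off_diagonal = abs(rho[0][1])

print(p_z)           # approximately [0.36, 0.64], i.e. |alpha|^2 and |beta|^2
print(p_x)           # approximately [0.5, 0.5]
print(off_diagonal)  # 0.0: rho is diagonal in the z basis, as in (15.18)
```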

Except when $|\alpha|^2 = |\beta|^2$ there is a unique basis, $|z_a^+\rangle$, $|z_a^-\rangle$, in which $\rho$ is diagonal.
However, $\rho$ can be used to assign a probability distribution for any basis, and
thus there is nothing special about the basis in which it is diagonal. In this respect
$\rho$ differs from operators that represent physical variables, such as the Hamiltonian,
for which the eigenfunctions do have a particular physical significance.

The expression on the right side of (15.14) is an example of the Schmidt form

$$|\Psi_1\rangle = \sum_j \lambda_j\, |a_j\rangle \otimes |b_j\rangle \tag{15.19}$$

introduced in (6.18), where $\{|a_j\rangle\}$ and $\{|b_k\rangle\}$ are special choices of orthonormal
bases for $\mathcal{A}$ and $\mathcal{B}$. The reduced density matrices $\rho$ and $\rho'$ for $\mathcal{A}$ and $\mathcal{B}$ are easily
calculated from the Schmidt form using (15.10) and (15.13), and one finds:

$$\rho = \sum_j |\lambda_j|^2 [a_j], \qquad \rho' = \sum_j |\lambda_j|^2 [b_j]. \tag{15.20}$$

One can check that $\rho$ in (15.18) is, indeed, given by this expression.
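
The check can be written out explicitly. The identification of Schmidt coefficients and basis vectors below is read off from (15.16) and (15.18) and should be taken as a worked illustration rather than as part of the original derivation:

```latex
\lambda_1 = \alpha,\quad |a_1\rangle = |z_a^+\rangle,\quad |b_1\rangle = |z_b^-\rangle;
\qquad
\lambda_2 = \beta,\quad |a_2\rangle = |z_a^-\rangle,\quad |b_2\rangle = |z_b^+\rangle,
\\[4pt]
\rho = |\lambda_1|^2 [a_1] + |\lambda_2|^2 [a_2] = |\alpha|^2 [z_a^+] + |\beta|^2 [z_a^-],
\qquad
\rho' = |\lambda_1|^2 [b_1] + |\lambda_2|^2 [b_2] = |\alpha|^2 [z_b^-] + |\beta|^2 [z_b^+].
```

The first expression agrees with (15.18), and $\rho$ and $\rho'$ have the same eigenvalues, as (15.20) requires.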

Relative to the physical state of the subsystem $\mathcal{A}$ at time $t_1$, $\rho$ contains the same
amount of information as $\Psi_1$. However, relative to the total system $\mathcal{A} \otimes \mathcal{B}$, $\rho$ is
much less informative. Suppose that

$$I_{\mathcal{B}} = \sum_k B^k \tag{15.21}$$

is some decomposition of the identity for subsystem $\mathcal{B}$, and we are interested in
histories of the form $\Psi_0 \odot (A^j \otimes B^k)$. Then the joint probability distribution

$$\Pr(A^j \wedge B^k) = \operatorname{Tr}[\Psi_1 (A^j \otimes B^k)] \tag{15.22}$$

can be calculated using $\Psi_1$, whereas from $\rho$ we can obtain only the marginal
distribution

$$\Pr(A^j) = \sum_k \Pr(A^j \wedge B^k). \tag{15.23}$$

The other marginal distribution, $\Pr(B^k)$, can be obtained using the reduced density
matrix $\rho'$ for subsystem $\mathcal{B}$. However, from a knowledge of both $\rho$ and $\rho'$, one still
cannot calculate the correlations between the two subsystems. For instance, in the
two-spin example of (15.14), if we use a framework in which $S_{az}$ and $S_{bz}$ are both
defined at $t_1$, $\Psi_1$ implies that $S_{az} = -S_{bz}$, a result which is not contained in $\rho$ or
$\rho'$. This illustrates the fact, pointed out in the introduction, that density matrices
typically provide partial descriptions of quantum systems, descriptions from which
certain features are omitted.
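
The loss of correlations can be seen in a few lines. The sketch below assumes illustrative real coefficients $\alpha = 0.6$, $\beta = 0.8$ and a framework in which $S_{az}$ and $S_{bz}$ are both defined. For rank-one projectors in the product $z$ basis, (15.22) reduces to the squared magnitude of an amplitude. The variable names are hypothetical.

```python
# Amplitudes of |Psi_1> in the product z basis, keyed by (sign of S_az, sign of S_bz).
alpha, beta = 0.6, 0.8  # illustrative values with alpha^2 + beta^2 = 1
amplitude = {("+", "-"): alpha, ("-", "+"): beta}

signs = ("+", "-")

# Joint distribution (15.22), obtainable from Psi_1.
joint = {(sa, sb): abs(amplitude.get((sa, sb), 0.0)) ** 2
         for sa in signs for sb in signs}

# Marginals (15.23); these are all that rho and rho' can supply.
marginal_a = {sa: sum(joint[(sa, sb)] for sb in signs) for sa in signs}
marginal_b = {sb: sum(joint[(sa, sb)] for sa in signs) for sb in signs}

# A distribution built from the marginals alone ignores the correlation.
product = {(sa, sb): marginal_a[sa] * marginal_b[sb]
           for sa in signs for sb in signs}

for key in joint:
    print(key, "joint:", joint[key], "product of marginals:", product[key])
```

The joint distribution assigns zero probability to equal signs, which is the statement $S_{az} = -S_{bz}$, whereas the product of the marginals does not.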

Rather than a projector on a one-dimensional subspace, $\Psi_1$ could itself be a
density matrix on $\mathcal{A} \otimes \mathcal{B}$. For example, if the total quantum system with Hilbert
space $\mathcal{A} \otimes \mathcal{B} \otimes \mathcal{C}$ consists of three subsystems $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$, and unitary time
evolution beginning with a normalized initial state $|\Phi_0\rangle$ at $t_0$ results in a state $|\Phi_1\rangle$
with projector $\Phi_1$ at $t_1$, then

$$\Psi_1 = \operatorname{Tr}_{\mathcal{C}}(\Phi_1) \tag{15.24}$$

is a density matrix. The partial traces of $\Psi_1$, (15.10) and (15.13), again define
density matrices $\rho$ and $\rho'$ appropriate for calculating probabilities of properties of
$\mathcal{A}$ or $\mathcal{B}$, since, for example,

$$\rho = \operatorname{Tr}_{\mathcal{B}}(\Psi_1) = \operatorname{Tr}_{\mathcal{BC}}(\Phi_1) \tag{15.25}$$

can be obtained from $\Psi_1$ or directly from $\Phi_1$. Even when $\mathcal{A} \otimes \mathcal{B}$ is not part of
a larger system it can be described by means of a density matrix as discussed in
Sec. 15.6.


15.4 Time dependence of reduced density matrix 

There is, of course, nothing very special about the time $t_1$ used in the discussion in
Sec. 15.3. If $|\Psi_t\rangle$ is a solution to the Schrödinger equation as a function of time $t$
for the composite system $\mathcal{A} \otimes \mathcal{B}$, and $\Psi_t$ the corresponding projector, then one can
define a density matrix

$$\rho_t = \operatorname{Tr}_{\mathcal{B}}(\Psi_t) \tag{15.26}$$

for subsystem $\mathcal{A}$ at any time $t$, and use it to calculate the probability of a history
of the form $\Psi_0 \odot A^j$ based on the two times 0 and $t$, where $A^j$ is a projector on
$\mathcal{A}$. One should not think of $\rho_t$ as some sort of physical property which develops
in time. Instead, it is somewhat analogous to the classical single-time probability
distribution $\rho_t(s)$ at time $t$ for a particle undergoing a random walk, or $\rho_t(\mathbf{r})$ for
a Brownian particle, discussed in Sec. 9.2. In particular, $\rho_t$ provides no information
about correlations of quantum properties at successive times. To discuss such
correlations requires the use of quantum histories, see Sec. 15.5.

In general, $\rho_t$ as a function of time does not satisfy a simple differential equation.
An exception is the case in which $\mathcal{A}$ is itself an isolated subsystem, so that the time
development operator for $\mathcal{A} \otimes \mathcal{B}$ factors,

$$T(t', t) = T_{\mathcal{A}}(t', t) \otimes T_{\mathcal{B}}(t', t), \tag{15.27}$$

or, equivalently, the Hamiltonian is of the form

$$H = H_{\mathcal{A}} \otimes I + I \otimes H_{\mathcal{B}} \tag{15.28}$$

during the times which are of interest. This would, for example, be the case if
$\mathcal{A}$ and $\mathcal{B}$ were particles (or larger systems) flying away from each other after a
collision. Using the fact that

$$\Psi_t = |\Psi_t\rangle\langle\Psi_t| = T(t, 0)\, \Psi_0\, T(0, t), \tag{15.29}$$

one can show (e.g., by writing $\Psi_0$ as a sum of product operators of the form $P \otimes Q$)
that when $T(t, 0)$ factors, (15.27),

$$\rho_t = T_{\mathcal{A}}(t, 0)\, \rho_0\, T_{\mathcal{A}}(0, t). \tag{15.30}$$

Upon differentiating this equation one obtains

$$i\hbar \frac{d\rho_t}{dt} = [H_{\mathcal{A}}, \rho_t], \tag{15.31}$$

since for an isolated system $T_{\mathcal{A}}(t, 0)$ satisfies (7.45) and (7.46) with $H_{\mathcal{A}}$ in place of
$H$. Note that (15.31) is also valid when $H_{\mathcal{A}}$ depends on time. If $H_{\mathcal{A}}$ is independent
of time and diagonal in the orthonormal basis $\{|e_n\rangle\}$,

$$H_{\mathcal{A}} = \sum_n E_n\, |e_n\rangle\langle e_n|, \tag{15.32}$$

one can use (7.48) to rewrite (15.30) in the form

$$\rho_t = e^{-itH_{\mathcal{A}}/\hbar}\, \rho_0\, e^{itH_{\mathcal{A}}/\hbar}, \tag{15.33}$$

or the equivalent in terms of matrix elements:

$$\langle e_m|\rho_t|e_n\rangle = \langle e_m|\rho_0|e_n\rangle \exp[-i(E_m - E_n)t/\hbar]. \tag{15.34}$$
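
A short numerical illustration of (15.34) may help. This is a sketch under stated assumptions: a two-level system, units in which $\hbar = 1$, made-up energies 0 and 1, and an initial density matrix with equal populations and maximal coherence. The function name is hypothetical.

```python
import cmath
import math


def evolve_matrix_elements(rho0, energies, t, hbar=1.0):
    """Apply (15.34): each element <e_m|rho|e_n> acquires the phase
    exp[-i (E_m - E_n) t / hbar] in the energy eigenbasis."""
    n = len(energies)
    return [
        [rho0[m][k] * cmath.exp(-1j * (energies[m] - energies[k]) * t / hbar)
         for k in range(n)]
        for m in range(n)
    ]


# Illustrative input (an assumption): equal populations, maximal coherence.
rho0 = [[0.5, 0.5], [0.5, 0.5]]
energies = [0.0, 1.0]

rho_t = evolve_matrix_elements(rho0, energies, math.pi)
for row in rho_t:
    print(row)
```

The diagonal elements, and hence the trace, are unchanged, while the off-diagonal elements rotate in phase; at $t = \pi$ in these units they have changed sign.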

There are situations in which (15.28) is only true in a first approximation, and
there is an additional weak interaction between $\mathcal{A}$ and $\mathcal{B}$, so that $\mathcal{A}$ is not truly
isolated. Under such circumstances it may still be possible, given a suitable system
$\mathcal{B}$, to write an approximate differential equation for $\rho_t$ in which additional terms
appear on the right side. A discussion of open systems of this type lies outside the
scope of this book.


15.5 Reduced density matrix as initial condition 

Let $\Psi_0$ be a projector representing an initial pure state at time $t_0$ for the composite
system $\mathcal{A} \otimes \mathcal{B}$, and assume that for $t > t_0$ the subsystem $\mathcal{A}$ is isolated from $\mathcal{B}$,
so that the time development operator factors, (15.27). We shall be interested in
histories of the form

$$Z^\alpha = \Psi_0 \odot Y^\alpha, \tag{15.35}$$

where

$$Y^\alpha = A_1^{\alpha_1} \odot A_2^{\alpha_2} \odot \cdots A_f^{\alpha_f} \tag{15.36}$$

is a history of $\mathcal{A}$ at the times $t_1 < t_2 < \cdots t_f$, with $t_1 > t_0$, and each of the
projectors $A_j^{\alpha_j}$ at time $t_j$ comes from a decomposition of the identity

$$I_{\mathcal{A}} = \sum_{\alpha_j} A_j^{\alpha_j} \tag{15.37}$$

of subsystem $\mathcal{A}$. A history of the form $Z^\alpha$ says nothing at all about what is going
on in $\mathcal{B}$ after the initial time $t_0$, even though there might be nontrivial correlations
between $\mathcal{A}$ and $\mathcal{B}$.

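The Heisenberg chain operator for $Z^\alpha$, Sec. 11.4, using a reference time $t_r = t_0$,
can be written in the form

$$\hat K(Z^\alpha) = [\hat K_{\mathcal{A}}(Y^\alpha) \otimes I]\, \Psi_0, \tag{15.38}$$

where

$$\hat K_{\mathcal{A}}(Y^\alpha) = \hat A_f^{\alpha_f} \cdots \hat A_2^{\alpha_2} \hat A_1^{\alpha_1} \tag{15.39}$$

is the Heisenberg chain operator for $Y^\alpha$, considered as a history of $\mathcal{A}$, with

$$\hat A_j^{\alpha_j} = T_{\mathcal{A}}(t_0, t_j)\, A_j^{\alpha_j}\, T_{\mathcal{A}}(t_j, t_0) \tag{15.40}$$

the Heisenberg counterpart of the Schrödinger operator $A_j^{\alpha_j}$, see (11.7).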

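By first taking a partial trace over $\mathcal{B}$, one can write the operator inner products
needed to check consistency and calculate weights for the histories in (15.35) in
the form

$$\begin{aligned}
\langle \hat K(Z^\alpha), \hat K(Z^{\bar\alpha}) \rangle &= \operatorname{Tr}\bigl[\Psi_0\, \hat K_{\mathcal{A}}^\dagger(Y^\alpha)\, \hat K_{\mathcal{A}}(Y^{\bar\alpha})\bigr] \\
&= \operatorname{Tr}_{\mathcal{A}}\bigl[\rho\, \hat K_{\mathcal{A}}^\dagger(Y^\alpha)\, \hat K_{\mathcal{A}}(Y^{\bar\alpha})\bigr]
 = \langle \hat K_{\mathcal{A}}(Y^\alpha), \hat K_{\mathcal{A}}(Y^{\bar\alpha}) \rangle_\rho,
\end{aligned} \tag{15.41}$$

where the operator inner product $\langle\,,\rangle_\rho$ is defined for any pair of operators $A$ and $\bar A$
on $\mathcal{A}$ by

$$\langle A, \bar A \rangle_\rho := \operatorname{Tr}_{\mathcal{A}}(\rho A^\dagger \bar A), \tag{15.42}$$

using the reduced density matrix

$$\rho = \operatorname{Tr}_{\mathcal{B}}(\Psi_0). \tag{15.43}$$

The definition (15.42) yields an inner product with all of the usual properties,
including $\langle A, A \rangle_\rho \geq 0$, except that it might be possible (depending on $\rho$) for
$\langle A, A \rangle_\rho$ to vanish when $A$ is not zero.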

The consistency conditions for the histories in (15.35) take the form

$$\langle \hat K_{\mathcal{A}}(Y^\alpha), \hat K_{\mathcal{A}}(Y^{\bar\alpha}) \rangle_\rho = 0 \quad \text{for } \alpha \neq \bar\alpha, \tag{15.44}$$

and the probability of occurrence of $Z^\alpha$ or, equivalently, $Y^\alpha$ is given by

$$\Pr(Z^\alpha) = \Pr(Y^\alpha) = \langle \hat K_{\mathcal{A}}(Y^\alpha), \hat K_{\mathcal{A}}(Y^\alpha) \rangle_\rho. \tag{15.45}$$

Thus as long as we are only interested in histories of the form (15.35) that make
no reference at all to $\mathcal{B}$ (aside from the initial state $\Psi_0$), the consistency conditions
and weights can be evaluated with formulas which only involve $\mathcal{A}$ and make no
reference to $\mathcal{B}$. They are of the same form employed in Ch. 10, except for replacing
the operator inner product $\langle\,,\rangle$ defined in (10.12) by $\langle\,,\rangle_\rho$ defined in (15.42). It is
also possible to write (15.44) and (15.45) using the Schrödinger chain operators
$K(Y^\alpha)$ in place of the Heisenberg operators $\hat K(Y^\alpha)$, and this alternative form is
employed in (15.48) and (15.50).

If $\mathcal{A}$ is a small system and $\mathcal{B}$ is large, the second trace in (15.41) will be much
easier to evaluate than the first. Thus using a density matrix can simplify what
might otherwise be a rather complicated problem. To be sure, calculating $\rho$ from
$\Psi_0$ using (15.43) may be a nontrivial task. However, it is often the case that $\Psi_0$
is not known, so what one does is to assume that $\rho$ has some form involving
adjustable parameters, which might, for example, be chosen on the basis of
experiment. Thus even if one does not know its precise form, the very fact that $\rho$ exists
can assist in analyzing a problem.

In the special case $f = 1$ in which the histories $Y^\alpha$ involve only a single time $t$,
and the consistency conditions (15.44) are automatically satisfied, the probability
(15.45) can be written in the form (15.3),

$$\Pr(A^j, t) = \operatorname{Tr}_{\mathcal{A}}(\rho_t A^j), \tag{15.46}$$

where $\rho_t$ is a solution of (15.31), or given by (15.33) in the case in which $H_{\mathcal{A}}$ is
independent of time. In this equation $\rho_t$ is functioning as a time-dependent
pre-probability; see the comments at the beginning of Sec. 15.4.


15.6 Density matrix for isolated system

It is also possible to use a density matrix $\rho$, thought of as a pre-probability, as
the initial state of an isolated system which is not regarded as part of a larger,
composite system. In such a case $\rho$ embodies whatever information is available
about the system, and this information does not have to be in the form of a particular
property represented by a projector, or a probability distribution associated with
some decomposition of the identity. As an example, the canonical density matrix

$$\rho = e^{-H/k\theta} / \operatorname{Tr}(e^{-H/k\theta}), \tag{15.47}$$

where $k$ is Boltzmann's constant and $H$ the time-independent Hamiltonian, is used
in quantum statistical mechanics to describe a system in thermal equilibrium at an
absolute temperature $\theta$. While one often pictures such a system as being in contact
with a thermal reservoir, and thus part of a larger, composite system, the density
matrix (15.47) makes perfectly good sense for an isolated system, and a system of
macroscopic size can constitute its own thermal reservoir.

The formulas employed in Sec. 15.5 can be used, with some obvious modifications,
to check consistency and assign probabilities to histories of an isolated
system for which $\rho$ is the initial pre-probability at the time $t_0$. Thus for a family
of histories of the form (15.36) at the times $t_1 < t_2 < \cdots t_f$, with $t_1 > t_0$, the
consistency condition takes the form

$$\langle K(Y^\alpha), K(Y^{\bar\alpha}) \rangle_\rho = \operatorname{Tr}\bigl[\rho K^\dagger(Y^\alpha) K(Y^{\bar\alpha})\bigr] = 0 \quad \text{for } \alpha \neq \bar\alpha, \tag{15.48}$$

where the (Schrödinger) chain operator is defined by

$$K(Y^\alpha) = A_f^{\alpha_f} T(t_f, t_{f-1}) \cdots A_2^{\alpha_2} T(t_2, t_1) A_1^{\alpha_1} T(t_1, t_0), \tag{15.49}$$

and the inner product $\langle\,,\rangle_\rho$ is the same as in (15.42), except for omitting the
subscript on Tr. If the consistency conditions are satisfied, the probability of
occurrence of a history $Y^\alpha$ is equal to its weight:

$$W(Y^\alpha) = \langle K(Y^\alpha), K(Y^\alpha) \rangle_\rho = \operatorname{Tr}\bigl[\rho K^\dagger(Y^\alpha) K(Y^\alpha)\bigr]. \tag{15.50}$$

One could equally well use Heisenberg chain operators $\hat K$ in (15.48) and (15.50),
as in the analogous formulas (15.44) and (15.45) in Sec. 15.5. Note that (15.48)
and (15.50) are essentially the same as the corresponding formulas (10.20) and
(10.11) in Ch. 10, aside from the presence of the density matrix $\rho$ inside the trace
defining the operator inner product $\langle\,,\rangle_\rho$.
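
To see how the consistency condition (15.48) depends on the initial density matrix, consider the following sketch. It makes several simplifying assumptions that are not in the text: a single spin-half system, trivial dynamics ($T = I$, so the chain operator (15.49) is just a product of projectors), and a two-time family with $[z^\pm]$ at $t_1$ followed by $[x^\pm]$ at $t_2$. All function and variable names are hypothetical.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def projector(ket):
    return [[x * complex(y).conjugate() for y in ket] for x in ket]


def chain_operator(projectors):
    """K(Y) = A_f ... A_2 A_1, i.e. (15.49) with T = I (an assumption)."""
    k = projectors[0]
    for p in projectors[1:]:
        k = matmul(p, k)
    return k


def inner_rho(rho, k1, k2):
    """<K1, K2>_rho = Tr(rho K1^dagger K2), as in (15.48)."""
    return trace(matmul(rho, matmul(dagger(k1), k2)))


s = 1 / math.sqrt(2)
z = {"+": projector([1, 0]), "-": projector([0, 1])}
x = {"+": projector([s, s]), "-": projector([s, -s])}

# Family of histories: [z+-] at t_1 followed by [x+-] at t_2.
histories = {(a, b): chain_operator([z[a], x[b]]) for a in "+-" for b in "+-"}


def is_consistent(rho, tol=1e-12):
    labels = list(histories)
    return all(abs(inner_rho(rho, histories[p], histories[q])) < tol
               for p in labels for q in labels if p != q)


rho_diag = [[0.7, 0.0], [0.0, 0.3]]  # diagonal in the z basis (illustrative)
rho_xplus = projector([s, s])        # the pure state [x+]

print("consistent for rho_diag: ", is_consistent(rho_diag))
print("consistent for rho_xplus:", is_consistent(rho_xplus))

# Weights (15.50); they are probabilities only because rho_diag passes the check.
weights = {y: inner_rho(rho_diag, k, k).real for y, k in histories.items()}
print(weights)
```

In this toy setting the same family of histories is consistent for the first initial density matrix but not for the second, so the consistency check cannot be carried out without reference to $\rho$.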

In the special case of histories involving only a single time $t > t_0$ and a
decomposition of the identity $I = \sum_j A^j$ at this time, consistency is automatic, and the
corresponding probabilities take the form

$$\Pr(A^j, t) = \operatorname{Tr}(\rho_t A^j), \tag{15.51}$$

or $\langle j|\rho_t|j\rangle$ when $A^j = |j\rangle\langle j|$ is a projector on a pure state, where $\rho_t$ is a solution
to the Schrödinger equation (15.31) with the subscript $\mathcal{A}$ omitted from $H$, or of
the form (15.33) when the Hamiltonian $H$ is independent of time. One should,
however, not make the mistake of thinking that $\rho_t$ as a function of time represents
anything like a complete description of the time development of a quantum system;
see the remarks at the beginning of Sec. 15.4. In order to discuss correlations it is
necessary to employ histories with two or more times following $t_0$. For these the
consistency conditions (15.48) are not automatic, and probabilities must be worked
out using (15.50). Both of these formulas require more information about time
development than is contained in $\rho_t$.

There are also situations in which information about the initial state of an isolated
system is given in the form of a probability distribution on a set of initial
states, and an initial density matrix is generated from this probability distribution.
The basic idea can be understood by considering a family of histories

$$[\psi_0^j] \odot [\phi_1^k] \tag{15.52}$$

involving two times $t_0$ and $t_1$, where $\{|\psi_0^j\rangle\}$ and $\{|\phi_1^k\rangle\}$ are orthonormal bases, and
the initial condition is that $[\psi_0^j]$ occurs with probability $p_j$. The probability that
$[\phi_1^k]$ occurs at time $t_1$ is given by

$$\Pr(\phi_1^k) = \sum_j \Pr(\phi_1^k \mid \psi_0^j)\, p_j, \tag{15.53}$$

where the conditional probabilities come from the Born formula

$$\Pr(\phi_1^k \mid \psi_0^j) = |\langle \phi_1^k|T(t_1, t_0)|\psi_0^j\rangle|^2. \tag{15.54}$$

An alternative method for calculating $\Pr(\phi_1^k)$ is to define a density matrix

$$\rho_0 = \sum_j p_j [\psi_0^j] \tag{15.55}$$

at $t_0$ using the initial probability distribution. Since each summand is a positive
operator, the sum is positive, Sec. 3.9, and the trace of $\rho_0$ is $\sum_j p_j = 1$. Unlike the
situations discussed previously, the eigenvalues of $\rho_0$ are of direct physical
significance, since they are the probabilities of the initial distribution, and the
eigenvectors are the physical properties of the system at $t_0$ for this family of histories. Next,
let

$$\rho_1 = T(t_1, t_0)\, \rho_0\, T(t_0, t_1) \tag{15.56}$$

be the result of integrating Schrödinger's equation, (15.31) with $H$ in place of $H_{\mathcal{A}}$,
from $t_0$ to $t_1$. Then the probabilities (15.53) can be written as

$$\Pr(\phi_1^k) = \operatorname{Tr}(\rho_1 [\phi_1^k]). \tag{15.57}$$

In this expression the density matrix $\rho_1$, in contrast to $\rho_0$, functions as a
pre-probability, and its eigenvalues and eigenvectors have no particular physical
significance.
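
The equivalence of (15.53) and (15.57) is easy to confirm numerically. The sketch below is a minimal illustration under assumed values: a two-level system, initial probabilities 0.25 and 0.75, and a time development operator $T(t_1, t_0)$ taken to be a real rotation by an arbitrary angle. None of these choices, nor the helper names, comes from the text.

```python
import math

theta = 0.3  # illustrative rotation angle defining T(t_1, t_0) (an assumption)
c, s = math.cos(theta), math.sin(theta)
T = [[c, -s], [s, c]]  # real orthogonal matrix, hence unitary

p = [0.25, 0.75]                  # initial probabilities p_j
psi0 = [[1.0, 0.0], [0.0, 1.0]]   # orthonormal initial kets |psi_0^j>
phi1 = [[1.0, 0.0], [0.0, 1.0]]   # orthonormal final kets |phi_1^k>
n = 2


def apply(matrix, ket):
    return [sum(matrix[i][j] * ket[j] for j in range(len(ket)))
            for i in range(len(matrix))]


def overlap(bra, ket):
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))


# Route 1: sum over conditional probabilities, (15.53) with (15.54).
pr_sum = [
    sum(abs(overlap(phi, apply(T, psi))) ** 2 * pj for psi, pj in zip(psi0, p))
    for phi in phi1
]

# Route 2: rho_1 = T rho_0 T^dagger = sum_j p_j |T psi_j><T psi_j|, then (15.57).
evolved = [apply(T, psi) for psi in psi0]
rho1 = [
    [sum(pj * e[i] * complex(e[m]).conjugate() for e, pj in zip(evolved, p))
     for m in range(n)]
    for i in range(n)
]
pr_trace = [
    sum(complex(phi[i]).conjugate() * rho1[i][m] * phi[m]
        for i in range(n) for m in range(n)).real
    for phi in phi1
]

print("via (15.53):", pr_sum)
print("via (15.57):", pr_trace)
```

Both routes give the same distribution. The second never forms the individual conditional probabilities, which is why $\rho_1$ carries less information than the full family of histories.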

The expression (15.57) is more compact than (15.53), as it does not involve the 
collection of conditional probabilities in (15.54). On the other hand, the description 
of the quantum system provided by $\rho_1$ is also less detailed. For example, one
cannot use it to calculate correlations between the various initial and final states, or
conditional probabilities such as

$$\Pr(\psi_0^j \mid \phi_1^k). \tag{15.58}$$

To be sure, a less detailed description is often more useful than one that is more 
detailed, especially when one is not interested in the details. The point is that 
a density matrix provides a partial description, and it is in principle possible to 
construct a more detailed description if one is interested in doing so. 


15.7 Conditional density matrices 

Suppose that at time $t_0$ a particle $\mathcal{A}$ has interacted with a device $\mathcal{B}$ and is moving
away from it, so that the two no longer interact, and assume that the projectors $\{B^k\}$
in the decomposition of the identity (15.21) for $\mathcal{B}$ represent some states of physical
significance. Given that $\mathcal{B}$ is in the state $B^k$ at time $t_0$, what can one say about the
future behavior of $\mathcal{A}$? For example, $\mathcal{B}$ might be a device which emits a spin-half
particle with a spin polarization $S_v = +1/2$, where the direction $v$ depends on
some setting of the device indicated by the index $k$ of $B^k$.

The question of interest to us can be addressed using a family of histories of the
form

$$Z^{k\alpha} = B^k \odot Y^{k\alpha}, \tag{15.59}$$

defined for the times $t_0 < t_1 < \cdots$, where the $Y^{k\alpha}$ are histories of $\mathcal{A}$ of the sort
defined in (15.36), except that they are labeled with $k$ as well as with $\alpha$ to allow for
the possibility that the decomposition of the identity in (15.7) could depend upon
$k$. (One could also employ a set of times $t_1 < t_2 < \cdots$ that depend on $k$.)

Assume that the combined system $\mathcal{A} \otimes \mathcal{B}$ is described at time $t_0$ by an initial
density matrix $\Psi_0$, which functions as a pre-probability. For example, $\Psi_0$ could
result from unitary time evolution of an initial state defined at a still earlier time.
Let

$$p_k = \operatorname{Tr}(\Psi_0 B^k) \tag{15.60}$$

be the probability of the event $B^k$. If $p_k$ is greater than 0, the $k$th conditional density
matrix is an operator on $\mathcal{A}$ defined by the partial trace

$$\rho^k = (1/p_k) \operatorname{Tr}_{\mathcal{B}}(\Psi_0 B^k). \tag{15.61}$$

Each conditional density matrix gives rise to an inner product

$$\langle A, \bar A \rangle_k := \operatorname{Tr}_{\mathcal{A}}(\rho^k A^\dagger \bar A) \tag{15.62}$$

of the form (15.42).

Using the same sort of analysis as in Sec. 15.5, one can show that the family of
histories (15.59) is consistent provided

$$\langle \hat K(Y^{k\alpha}), \hat K(Y^{k\bar\alpha}) \rangle_k = 0 \quad \text{for } \alpha \neq \bar\alpha \tag{15.63}$$

is satisfied for every $k$ with $p_k > 0$, where the Heisenberg chain operators $\hat K(Y^{k\alpha})$
are defined as in (15.39), but with the addition of a superscript $k$ for each projector
on the right side. Schrödinger chain operators could also be used, as in Sec. 15.6.
Note that one does not have to check "cross terms" involving chain operators of
histories with different values of $k$. If the consistency conditions are satisfied, the
behavior of $\mathcal{A}$ given that $\mathcal{B}$ is in the state $B^k$ at $t_0$ is described by the conditional
probabilities

$$\Pr(Y^{k\alpha} \mid B^k) = \langle \hat K(Y^{k\alpha}), \hat K(Y^{k\alpha}) \rangle_k. \tag{15.64}$$

The physical interpretation of the conditional density matrix is essentially the
same as that of the simple density matrix $\rho$ discussed in Sec. 15.5. Indeed, the
latter can be thought of as a special case in which the decomposition of the identity
of $\mathcal{B}$ in (15.21) consists of nothing but the identity itself. Note in particular that the
eigenvalues and eigenvectors of $\rho^k$ play no (direct) role in its physical
interpretation, since $\rho^k$ functions as a pre-probability.

Time-dependent conditional density matrices can be defined in the obvious way,

$$\rho_t^k = T_{\mathcal{A}}(t, t_0)\, \rho^k\, T_{\mathcal{A}}(t_0, t), \tag{15.65}$$

as solutions of the Schrödinger equation (15.31). One can use $\rho_t^k$ to calculate the
probability of an event $A$ in $\mathcal{A}$ at time $t$ conditional upon $B^k$, but not correlations
between events in $\mathcal{A}$ at several different times. The comments about $\rho_t$ at the
beginning of Sec. 15.4 also apply to $\rho_t^k$.

The simple or "unconditional" density matrix of $\mathcal{A}$ at time $t_0$,

$$\rho = \operatorname{Tr}_{\mathcal{B}}(\Psi_0), \tag{15.66}$$

is an average of the conditional density matrices:

$$\rho = \sum_k p_k \rho^k. \tag{15.67}$$

While $\rho$ can be used to check consistency and calculate probabilities of histories in
$\mathcal{A}$ which make no reference to $\mathcal{B}$, for these purposes there is no need to introduce
the refined family (15.59) in place of the coarser (15.35). To put it somewhat
differently, the context in which the average (15.67) might be of interest is one in
which $\rho$ is not the appropriate mathematical tool for addressing the questions one
is likely to be interested in.

Let us consider the particular case in which $\Psi_0 = |\Psi_0\rangle\langle\Psi_0|$ and the projectors

$$B^k = |b^k\rangle\langle b^k| \tag{15.68}$$

are pure states. Then one can expand $|\Psi_0\rangle$ in terms of the $|b^k\rangle$ in the form

$$|\Psi_0\rangle = \sum_k \sqrt{p_k}\, |\alpha^k\rangle \otimes |b^k\rangle, \tag{15.69}$$

where $p_k$ was defined in (15.60). Inserting the coefficient $\sqrt{p_k}$ in (15.69) means
that the $\{|\alpha^k\rangle\}$ are normalized, $\langle\alpha^k|\alpha^k\rangle = 1$, but there is no reason to expect $|\alpha^k\rangle$
and $|\alpha^l\rangle$ to be orthogonal for $k \neq l$. The conditional density matrices are now pure
states represented by the dyads

$$\rho^k = |\alpha^k\rangle\langle\alpha^k|, \tag{15.70}$$

and (15.67) takes the form

$$\rho = \sum_k p_k\, |\alpha^k\rangle\langle\alpha^k| = \sum_k p_k [\alpha^k]. \tag{15.71}$$

The expression (15.71) is sometimes interpreted to mean that the system $\mathcal{A}$ is
in the state $|\alpha^k\rangle$ with probability $p_k$ at time $t_0$. However, this is a bit misleading,
because in general the $|\alpha^k\rangle$ are not mutually orthogonal, and if two quantum states
are not orthogonal to each other, it does not make sense to ask whether a system
is in one or the other, as they do not represent mutually-exclusive possibilities;
see Sec. 4.6. Instead, one should assign a probability $p_k$ at time $t_0$ to the state
$|\alpha^k\rangle \otimes |b^k\rangle$ of the combined system $\mathcal{A} \otimes \mathcal{B}$. Such states are mutually orthogonal
because the $|b^k\rangle$ are mutually orthogonal. In general, $|\alpha^k\rangle$ is an event dependent
on $|b^k\rangle$ in the sense discussed in Ch. 14, so it does not make sense to speak of
$[\alpha^k]$ as a property of $\mathcal{A}$ by itself without making at least implicit reference to the
state $|b^k\rangle$ of $\mathcal{B}$. If one wants to ascribe a probability to $|\alpha^k\rangle \otimes |b^k\rangle$, this ket or the
corresponding projector must be an element of an appropriate sample space. The
projector does not appear in (15.59), but one can insert it by replacing $B^k = [b^k]$
with $[\alpha^k] \otimes [b^k]$. The resulting collection of histories then forms the support of
what is, at least technically, a different consistent family of histories. However,
the consistency conditions and the probabilities in the new family are the same as
those in the original family (15.59), so the distinction is of no great importance.
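
The relations (15.61), (15.67), and (15.69)-(15.71) can be illustrated with a small numerical sketch. It assumes a pure state $\Psi_0$ on two spin-half systems, equal probabilities $p_k = 1/2$, and the deliberately non-orthogonal choice $|\alpha^1\rangle = |z^+\rangle$, $|\alpha^2\rangle = |x^+\rangle$ (indexed 0 and 1 in the code). These values and the helper names are illustrative assumptions.

```python
import math

s = 1 / math.sqrt(2)
p = [0.5, 0.5]                      # probabilities p_k of the events B^k
alpha_kets = [[1.0, 0.0], [s, s]]   # normalized but non-orthogonal |alpha^k>
dim_a, dim_b = 2, 2

# Amplitudes psi[i][k] of |Psi_0> = sum_k sqrt(p_k) |alpha^k> (x) |b^k>, (15.69).
psi = [[math.sqrt(p[k]) * alpha_kets[k][i] for k in range(dim_b)]
       for i in range(dim_a)]


def conditional_density_matrix(psi, k):
    """Return (p_k, rho^k) with rho^k = (1/p_k) Tr_B(Psi_0 B^k), B^k = [b^k]."""
    column = [psi[i][k] for i in range(len(psi))]
    p_k = sum(abs(a) ** 2 for a in column)
    rho_k = [[a * complex(b).conjugate() / p_k for b in column] for a in column]
    return p_k, rho_k


parts = [conditional_density_matrix(psi, k) for k in range(dim_b)]

# Unconditional density matrix as the average (15.67).
rho = [[sum(p_k * rho_k[i][j] for p_k, rho_k in parts) for j in range(dim_a)]
       for i in range(dim_a)]

# The |alpha^k> are not orthogonal, so they are not mutually exclusive properties.
overlap = sum(a * b for a, b in zip(alpha_kets[0], alpha_kets[1]))

print("p_k:", [p_k for p_k, _ in parts])
print("rho:", rho)
print("<alpha^1|alpha^2>:", overlap)
```

The resulting $\rho$ is a legitimate density matrix of the form (15.71), yet the nonzero overlap shows why it cannot be read as "the system is in $|\alpha^k\rangle$ with probability $p_k$" without reference to $\mathcal{B}$.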


16.1 Some general principles

There are some important differences between quantum and classical reasoning 
which reflect the different mathematical structure of the two theories. The most 
precise classical description of a mechanical system is provided by a point in the 
classical phase space, while the most precise quantum description is a ray or one- 
dimensional subspace of the Hilbert space. This in itself is not an important dif- 
ference. What is more significant is the fact that two distinct points in a classical 
phase space represent mutually exclusive properties of the physical system: if one 
is a true description of the system, the other must be false. In quantum theory, on the
other hand, properties are mutually exclusive in this sense only if the correspond- 
ing projectors are mutually orthogonal. Distinct rays in the Hilbert space need not 
be orthogonal to each other, and when they are not orthogonal, they do not corre- 
spond to mutually exclusive properties. As explained in Sec. 4.6, if the projectors 
corresponding to the two properties do not commute with one another, and are 
thus not orthogonal, the properties are (mutually) incompatible. The relationship 
of incompatibility means that the properties cannot be logically compared, a situ- 
ation which does not arise in classical physics. The existence of this nonclassical 
relationship of incompatibility is a direct consequence of assuming (following von 
Neumann) that the negation of a property corresponds to the orthogonal comple- 
ment of the corresponding subspace of the Hilbert space; see the discussion in 
Sec. 4.6. 
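
A concrete instance, given here only as an illustration with matrices written in the $S_z$ basis of a single spin-half particle, is the pair of projectors $[z^+]$ and $[x^+]$:

```latex
[z^+] = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
[x^+] = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},
\\[6pt]
[z^+][x^+] - [x^+][z^+]
 = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}
 - \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}
 = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \neq 0 .
```

The two rays are distinct but not orthogonal, the projectors do not commute, and so the properties $S_z = +1/2$ and $S_x = +1/2$ are incompatible rather than mutually exclusive.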

Quantum reasoning is (at least formally) identical to classical reasoning when 
using a single quantum framework, and for this reason it is important to be aware 
of the framework which is being used to construct a quantum description or carry 
out quantum reasoning. A framework is a Boolean algebra of commuting projec- 
tors based upon a suitable sample space, Sec. 5.2. The sample space is a collec- 
tion of mutually orthogonal projectors which sum to the identity, and thus form a 
decomposition of the identity. A sample space of histories must also satisfy the
consistency conditions discussed in Ch. 10. 

In quantum theory there are always many possible frameworks which can be 
used to describe a given quantum system. While this situation can also arise in 
classical physics, as when one considers alternative coarse grainings of the phase 
space, it does not occur very often, and in any case classical frameworks are always 
mutually compatible, in the sense that they possess a common refinement. For rea- 
sons discussed in Sec. 16.4, compatible frameworks do not give rise to conceptual 
difficulties. By contrast, different quantum frameworks are generally incompati- 
ble, which means that the corresponding descriptions cannot be combined. As a 
consequence, when constructing a quantum description of a physical system it is 
necessary to restrict oneself to a single framework, or at least not mix results from 
incompatible frameworks. This single-framework rule or single-family rule has no 
counterpart in classical physics. Alternatively, one can say that in classical physics 
the single-framework rule is always satisfied, for reasons indicated in Sec. 26.6, so 
one never needs to worry about it. 

Quantum dynamics differs from classical Hamiltonian dynamics in that the lat- 
ter is deterministic: given a point in phase space at some time, there is a unique 
trajectory in phase space representing the states of the system at earlier or later 
times. In the quantum case, the dynamics is stochastic: even given a precise state 
of the system at one time, various alternatives can occur at other times, and the 
theory only provides probabilities for these alternatives. (Only in the exceptional 
case of unitary histories, see Secs. 8.7 and 10.3, is there a unique (probability 1) 
possibility at each time, and thus a deterministic dynamics.) Stochastic dynamics 
requires both the specification of an appropriate sample space or family of histories, 
as discussed in Ch. 8, and also a rule for assigning probabilities to histories. The 
latter, see Chs. 9 and 10, involves calculating weights for the histories using the 
unitary time development operators T(t', t), equivalent to solving Schrodinger’s 
equation, and then combining these with contingent data, typically an initial con- 
dition. Consequently, the reasoning process involved in applying the laws of quan- 
tum dynamics is somewhat different from that used for a deterministic classical 
system. 

Probabilities can be consistently assigned to a family of histories of an isolated 
quantum system using the laws of quantum dynamics only if the family is repre- 
sented by a Boolean algebra of projectors satisfying the consistency conditions dis- 
cussed in Ch. 10. A family which satisfies these conditions is known as a consistent 
family or framework. Each framework has its own sample space, and the single- 
framework rule says that the probabilities which apply to one framework cannot 
be used for a different framework, even for events or histories which are repre- 
sented in both frameworks. It is, however, often possible to assign probabilities to 
elements of two or more distinct frameworks using the same initial data, as
discussed below.

The laws of logic allow one to draw correct conclusions from some initial propo- 
sitions, or “data”, assuming the latter are correct. (Following the rules does not 
by itself always lead to the right answer; the principle of “garbage in, garbage out” 
was known to ancient logicians, though no doubt they worded it differently.) This is 
the sort of quantum reasoning with which we are concerned in this chapter. Given 
some facts or features of a quantum system, the “initial data”, what else can we say 
about it? What conclusions can we draw by applying the principles of quantum 
theory? For example, an atom is in its ground state and a fast muon passes by 1 
nm away: Will the atom be ionized? The “initial data” may simply be the initial 
state of the quantum system, but could also include information about what hap- 
pens later, as in the specific example discussed in Sec. 16.2. Thus “initial” refers 
to what is given at the beginning of the logical argument, not necessarily some 
property of the quantum system which occurred before something else that one is 
interested in. 

The first step in drawing conclusions from initial data consists in expressing 
the latter in proper quantum mechanical terms. In a typical situation the data are 
embedded in a sample space of mutually-exclusive possibilities by assigning prob- 
abilities to the elements of this space. This includes the case in which the initial 
data identify a unique element of the sample space that is assigned a probability of 
1, while all other elements have probability 0. If the initial data include information 
about the system at different times, the Hilbert space must, of course, be the Hilbert 
space of histories, and the sample space will consist of histories. See the example 
in Sec. 16.2. Initial data can also be expressed using a density matrix thought of as 
a pre-probability, see Sec. 15.6. Initial data which cannot be expressed in appro- 
priate quantum terms cannot be used to initiate a quantum reasoning process, even 
if they make good classical sense. 

Once the initial data have been embedded in a sample space, and probabili- 
ties have been assigned in accordance with quantum laws, the reasoning process 
follows the usual rules of probability theory. This means that, in general, the con- 
clusion of the reasoning process will be a set of probabilities, rather than a definite 
result. However, if a consequence can be inferred with probability 1, we call it 
“true”, while if some event or history has probability 0, it is “false”, always assum- 
ing that the initial data are “true”. 

It is worth emphasizing once again that the peculiarities of quantum theory do 
not manifest themselves as long as one is using a single sample space and the corre- 
sponding event algebra. Instead, they come about because there are many different 
sample spaces in which one can embed the initial data. Hence the conclusions one 
can draw from those data depend upon which sample space is being used. This 
multiplicity of sample spaces poses some special problems for quantum reasoning,
and these will be discussed in Secs. 16.3 and 16.4, after considering a specific
example in the next section.

There are many other sorts of reasoning which go on when quantum theory 
is applied to a particular problem; e.g., the correct choice of boundary condi- 
tions for solving a differential equation, the appropriate approximation to be em- 
ployed for calculating the time development, the use of symmetries, etc. These 
are not included in the present discussion because they are the same as in classical 
physics. 


16.2 Example: Toy beam splitter 

Consider the toy beam splitter with a detector in the $c$ output channel shown in
Fig. 12.2 on page 166 and discussed in Sec. 12.2. Suppose that the initial state at
$t = 0$ is $|0a, 0\hat c\rangle$: the particle in the $a$ entrance channel to the beam splitter, and
the detector in its $0\hat c$ "ready" state. Also suppose that at $t = 3$ the detector is in its
$1\hat c$ state indicating that the particle has been detected. These pieces of information
about the system at $t = 0$ and $t = 3$ constitute the initial data as that term was
defined in Sec. 16.1. We shall also make use of a certain amount of "background"
information: the structure of the toy model and its unitary time transformation, as
found in Sec. 12.2.

In order to draw conclusions from the initial data, they must be embedded in an 
appropriate sample space. A useful approach is to begin with a relatively coarse 
sample space, and then refine it in different ways depending upon the sorts of 
questions one is interested in. One choice for the initial, coarse sample space is the 
set of histories 


$$\begin{aligned}
X^* &= [0a, 0\hat c] \odot I \odot I \odot [1\hat c], \\
X^\circ &= [0a, 0\hat c] \odot I \odot I \odot [0\hat c], \\
X^z &= R \odot I \odot I \odot I
\end{aligned} \tag{16.1}$$

for the times $t = 0, 1, 2, 3$, where $R = I - [0a, 0\hat c]$. Here the superscript $*$ stands
for the triggered and $\circ$ for the ready state of the detector at $t = 3$. The sum of
these projectors is the history identity $\breve I$, and it is easy to see that the consistency
conditions are satisfied in view of the orthogonality of the initial and final states,
Sec. 11.3. Since $X^*$ is the only member of (16.1) consistent with the initial data, it
is assigned probability 1, and the others are assigned probability 0.

Where was the particle at t = 1? The histories in (16.1) tell us nothing about 
any property of the system at t = 1, since the identity I is uninformative. Thus in
order to answer this question we need to refine the sample space. This can be done
by replacing $X^{*}$ with the three history projectors


$$
\begin{aligned}
X^{*c} &= [0a, 0\hat{c}] \odot [1c] \odot I \odot [1\hat{c}], \\
X^{*d} &= [0a, 0\hat{c}] \odot [1d] \odot I \odot [1\hat{c}], \\
X^{*p} &= [0a, 0\hat{c}] \odot P \odot I \odot [1\hat{c}],
\end{aligned}
\tag{16.2}
$$

whose sum is $X^{*}$, where

$$
P = I - [1c] - [1d] \tag{16.3}
$$

is the projector for the particle to be someplace other than sites 1c or 1d. The
weights of $X^{*d}$ and $X^{*p}$ are 0, given the dynamics as specified in Sec. 12.2. The
history $X^{\circ}$ can be refined in a similar way, and the weights of $X^{\circ c}$ and $X^{\circ p}$ are 0.
(We shall not bother to refine $X^{z}$, though this could also be done if one wanted to.)
Consistency is easily checked. 

When one refines a sample space, the probability associated with each of the 
elements of the original space is divided up among their replacements in proportion 
to their weights, as explained in Sec. 9.1. Consequently, in the refined sample 
space, $X^{*c}$ has probability 1, and all the other histories have probability 0. Note
that while $X^{*d}$ and $X^{*p}$ are consistent with the initial data, the fact that they have
zero weight (are dynamically impossible) means that they have zero probability.
From this we conclude that the initial data imply that the particle has the property
[1c], meaning that it is at the site 1c, at t = 1. That is, [1c] at t = 1 is true if one
assumes the initial data are true.
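
The weights quoted above can be checked numerically. The following is only a sketch under stated assumptions: the concrete form of the time step (a hop operator S, a detector that flips when the particle sits at 2c, and an artificial wrap-around from the m = 3 sites back to m = 0 that keeps the step unitary) is a guess at a minimal version of the Sec. 12.2 model, and all function and variable names are illustrative.

```python
import numpy as np

# ASSUMPTION: an 8-site toy beam splitter; the wrap-around hop (3c -> 0a,
# 3d -> 0b) is added only so that the time step is unitary.
SITES = ["0a", "0b", "1c", "1d", "2c", "2d", "3c", "3d"]
r = 1 / np.sqrt(2)

def ket(site):
    v = np.zeros(len(SITES))
    v[SITES.index(site)] = 1.0
    return v

# Particle hop S, written column by column (column = image of a basis ket).
images = {
    "0a": r * (ket("1c") + ket("1d")),    # beam splitter: |0a> -> |1 a-bar>
    "0b": r * (-ket("1c") + ket("1d")),   # beam splitter: |0b> -> |1 b-bar>
    "1c": ket("2c"), "1d": ket("2d"),
    "2c": ket("3c"), "2d": ket("3d"),
    "3c": ket("0a"), "3d": ket("0b"),     # artificial wrap-around
}
S = np.column_stack([images[s] for s in SITES])

# Detector (0 = ready, 1 = triggered) flips when the particle is at 2c.
P2c = np.outer(ket("2c"), ket("2c"))
flip = np.array([[0.0, 1.0], [1.0, 0.0]])
R = np.kron(np.eye(8) - P2c, np.eye(2)) + np.kron(P2c, flip)
T = np.kron(S, np.eye(2)) @ R             # one time step on particle (x) detector

def site(s):
    """Projector [s] on the particle, identity on the detector."""
    return np.kron(np.outer(ket(s), ket(s)), np.eye(2))

def detector(n):
    """Projector on detector state n, identity on the particle."""
    return np.kron(np.eye(8), np.diag([1.0 - n, float(n)]))

def weight(events):
    """W(Y) = Tr[K^dagger K] with chain operator K = E_3 T E_2 T E_1 T E_0."""
    K = events[0]
    for E in events[1:]:
        K = E @ T @ K
    return float(np.sum(np.abs(K) ** 2))

I16 = np.eye(16)
E0 = site("0a") @ detector(0)             # initial state [0a, 0c-hat]
P = I16 - site("1c") - site("1d")         # Eq. (16.3)
refinement = {"X*c": site("1c"), "X*d": site("1d"), "X*p": P}
weights = {name: weight([E0, E1, I16, detector(1)])
           for name, E1 in refinement.items()}
total = sum(weights.values())
probs = {name: w / total for name, w in weights.items()}
print(weights)
print(probs)
```

Under these assumed dynamics only the history through [1c] has nonzero weight, so conditioning on the initial data assigns it probability 1, in agreement with the argument in the text.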

Given the same initial data, one can ask a different question: At t = 1, was the
particle in one or the other of the two states

$$
|1\bar{a}\rangle = \bigl(|1c\rangle + |1d\rangle\bigr)/\sqrt{2}, \qquad
|1\bar{b}\rangle = \bigl(-|1c\rangle + |1d\rangle\bigr)/\sqrt{2}
\tag{16.4}
$$

resulting from the unitary evolution of $|0a\rangle$ and $|0b\rangle$ (see (12.2))? To answer this
question, we use an alternative refinement of the sample space (16.1), in which $X^{*}$
is replaced with the three histories

$$
\begin{aligned}
X^{*\bar{a}} &= [0a, 0\hat{c}] \odot [1\bar{a}] \odot I \odot [1\hat{c}], \\
X^{*\bar{b}} &= [0a, 0\hat{c}] \odot [1\bar{b}] \odot I \odot [1\hat{c}], \\
X^{*p} &= [0a, 0\hat{c}] \odot P \odot I \odot [1\hat{c}],
\end{aligned}
\tag{16.5}
$$

with P again given by (16.3). (Note that $[1\bar{a}] + [1\bar{b}]$ is the same as [1c] + [1d].) A
similar refinement can be carried out for $X^{\circ}$. Both $X^{*\bar{b}}$ and $X^{*p}$ have zero weight,
so the initial data imply that the history $X^{*\bar{a}}$ has probability 1. Consequently, we
can conclude that the particle is in the superposition state $[1\bar{a}]$ with probability 1
at t = 1. That is, $[1\bar{a}]$ at t = 1 is true if one assumes the initial data are true.

However, the family which includes (16.5) is incompatible with the one which
includes (16.2), as is obvious from the fact that [1c] and $[1\bar{a}]$ do not commute with
each other. Hence the probability 1 (true) conclusion obtained using one family
cannot be combined with the probability 1 conclusion obtained using the other
family. We cannot deduce from the initial data that at t = 1 the particle was in
the state [1c] and also in the state $[1\bar{a}]$, for this is quantum nonsense. Putting
together results from two incompatible frameworks in this way violates the
single-framework rule. So which is the correct family to use in order to work out the real
state of the particle at t = 1: should one employ (16.2) or (16.5)? This is not a
meaningful question in the context of quantum theory, for reasons which will be 
discussed in Sec. 16.4. 
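
The incompatibility can be made concrete with a few lines of linear algebra. This is a minimal sketch restricted to the two-dimensional subspace spanned by |1c⟩ and |1d⟩; the helper names `proj` and `commutator` are illustrative, not notation from the text.

```python
import numpy as np

c = np.array([1.0, 0.0])          # |1c>
d = np.array([0.0, 1.0])          # |1d>
abar = (c + d) / np.sqrt(2)       # |1 a-bar>, Eq. (16.4)

def proj(v):
    """Projector onto the ray spanned by the normalized vector v."""
    return np.outer(v, v.conj())

def commutator(A, B):
    return A @ B - B @ A

P_c, P_d, P_abar = proj(c), proj(d), proj(abar)

# [1c] and [1d] are orthogonal projectors, so they commute and can
# belong to the same framework, as in (16.2).
print(np.allclose(commutator(P_c, P_d), 0))

# [1c] and [1 a-bar] do not commute, so no single framework contains both.
print(np.allclose(commutator(P_c, P_abar), 0))
```

The first check succeeds and the second fails, which is the algebraic content of the statement that the families based on (16.2) and (16.5) cannot be combined.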

Now let us ask a third question based on the same initial data used previously. 
Where was the particle at t = 2: was it at 2c or at 2d? The answer is obvious. All
we need to do is to replace (16.2) with a different refinement

$$
\begin{aligned}
X^{*c'} &= [0a, 0\hat{c}] \odot I \odot [2c] \odot [1\hat{c}], \\
X^{*d'} &= [0a, 0\hat{c}] \odot I \odot [2d] \odot [1\hat{c}], \\
X^{*p'} &= [0a, 0\hat{c}] \odot I \odot P' \odot [1\hat{c}],
\end{aligned}
\tag{16.6}
$$

with $P' = I - [2c] - [2d]$. Since $X^{*c'}$ has probability 1, it is certain, given the
initial data, that the particle was at 2c at t = 2.

The same answer can be obtained starting with the sample space which includes 
the histories in (16.2), and refining it to include the history 

$$
X^{*cc} = [0a, 0\hat{c}] \odot [1c] \odot [2c] \odot [1\hat{c}], \tag{16.7}
$$

which has probability 1, along with additional histories with probability 0. In the 
same way, one could start with the sample space which includes the histories in 
(16.5), and refine it so that it contains 

$$
X^{*\bar{a}c} = [0a, 0\hat{c}] \odot [1\bar{a}] \odot [2c] \odot [1\hat{c}], \tag{16.8}
$$

whose probability (conditional upon the initial data) is 1, plus others whose prob- 
ability is 0. It is obvious that the sample space containing (16.7) is incompatible 
with that containing (16.8), since these two history projectors do not commute with 
each other. Nonetheless, either family can be used to answer the question “Where 
is the particle at t = 2?”, and both give precisely the same answer: the initial data
imply that it is at 2c, and not someplace else.
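
That the two incompatible refinements agree about t = 2 can also be checked numerically. As before, this is a sketch that assumes a particular minimal form of the toy dynamics (detector flips when the particle is at 2c, then the particle hops, with an artificial wrap-around to keep the step unitary); the function `prob_2c_given_data` and the other names are hypothetical helpers.

```python
import numpy as np
from itertools import product

SITES = ["0a", "0b", "1c", "1d", "2c", "2d", "3c", "3d"]
BASIS = [(s, n) for s in SITES for n in (0, 1)]   # (particle site, detector state)
POS = {b: i for i, b in enumerate(BASIS)}
HOP = {"1c": "2c", "1d": "2d", "2c": "3c", "2d": "3d", "3c": "0a", "3d": "0b"}
r = 1 / np.sqrt(2)

# ASSUMED time step: detector flips if the particle is at 2c, then the particle hops.
T = np.zeros((16, 16))
for (s, n), col in POS.items():
    m = n ^ 1 if s == "2c" else n
    if s == "0a":                                  # |0a> -> |1 a-bar>
        T[POS[("1c", m)], col], T[POS[("1d", m)], col] = r, r
    elif s == "0b":                                # |0b> -> |1 b-bar>
        T[POS[("1c", m)], col], T[POS[("1d", m)], col] = -r, r
    else:
        T[POS[(HOP[s], m)], col] = 1.0

def site_proj(amps):
    """Projector onto a particle state given as {site: amplitude}."""
    v = np.zeros(len(SITES))
    for s, a in amps.items():
        v[SITES.index(s)] = a
    return np.kron(np.outer(v, v), np.eye(2))

def det_proj(n):
    return np.kron(np.eye(len(SITES)), np.diag([1.0 - n, float(n)]))

I16 = np.eye(16)
E0 = site_proj({"0a": 1.0}) @ det_proj(0)
c1, d1 = site_proj({"1c": 1.0}), site_proj({"1d": 1.0})
a1, b1 = site_proj({"1c": r, "1d": r}), site_proj({"1c": -r, "1d": r})
c2, d2 = site_proj({"2c": 1.0}), site_proj({"2d": 1.0})

def weight(events):
    K = events[0]
    for E in events[1:]:
        K = E @ T @ K
    return float(np.sum(np.abs(K) ** 2))           # Tr[K^dagger K]

def prob_2c_given_data(t1_events):
    """Pr(2c at t=2 | initial state, triggered detector) within one family."""
    t1 = t1_events + [I16 - sum(t1_events)]
    t2 = [c2, d2, I16 - c2 - d2]
    total = sum(weight([E0, e1, e2, det_proj(1)]) for e1, e2 in product(t1, t2))
    hit = sum(weight([E0, e1, c2, det_proj(1)]) for e1 in t1)
    return hit / total

print(prob_2c_given_data([c1, d1]))   # family containing (16.7)
print(prob_2c_given_data([a1, b1]))   # family containing (16.8)
```

Both families give the same conditional probability for [2c] at t = 2, even though their t = 1 events, [1c] and the superposition projector, do not commute. Dividing weights in this way is only meaningful because each family is consistent, which the sketch takes for granted rather than verifies.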


16.3 Internal consistency of quantum reasoning

The example in Sec. 16.2 illustrates the principles of quantum reasoning intro- 
duced in Sec. 16.1. It also exhibits some important ways in which reasoning about 
quantum systems differs from what one is accustomed to in classical physics. In 
deterministic classical mechanics one is used to starting from some initial state and 
integrating the equations of motion to produce a trajectory in which at each time 
the system is described by a single point in its phase space. Given this trajectory 
one can answer any question of physical interest such as, for example, the time 
dependence of the kinetic energy. 

In quantum theory one typically (unitary histories are an exception) uses a rather 
different strategy. Instead of starting with a single well-defined temporal develop- 
ment which can answer all questions, one has to start with the physical questions 
themselves and use these questions to generate an appropriate framework in which 
they make sense. Once this framework is specified, the principles of stochastic 
quantum dynamics can be brought to bear in order to supply answers, usually in 
the form of probabilities, to the questions one is interested in. 

One cannot use a single framework to answer all possible questions about a 
quantum system, because answering one question will require the use of a frame- 
work that is incompatible with another framework needed to address some other 
question. But even a particular question can often be answered using more than one 
framework, as illustrated by the third (last) question in Sec. 16.2. This multiplicity 
of frameworks, along with the rule which requires that a quantum description, or 
the reasoning from initial data to a conclusion, use only a single framework, raises 
two somewhat different issues. The first issue is that of internal consistency: if 
many frameworks are available, will one get the same answer to the same question 
if one works it out in different frameworks? We shall show that this is, indeed, the 
case. The second issue, discussed in the next section, is the intuitive significance 
of the fact that alternative incompatible frameworks can be employed for one and 
the same quantum system. 

The internal consistency of quantum reasoning can be shown in the following 
way. Assume that $\mathcal{F}_1, \mathcal{F}_2, \ldots, \mathcal{F}_n$ are different consistent families of histories,
which may be incompatible with one another, each of which contains the initial
data and the other events, or histories, that are needed to answer a particular physical
question. Each framework is a set of projectors which forms a Boolean algebra,
and one can define $\mathcal{F}$ to be their set-theoretic intersection:

$$
\mathcal{F} = \mathcal{F}_1 \cap \mathcal{F}_2 \cap \cdots \cap \mathcal{F}_n. \tag{16.9}
$$

That is, a projector Y is in $\mathcal{F}$ if and only if it is also in each $\mathcal{F}_j$, for $1 \le j \le n$.
It is straightforward to show that $\mathcal{F}$ is a Boolean algebra of commuting history
projectors: It contains the history identity $\breve{I}$; if it contains a projector Y, then it
also contains its negation $\breve{I} - Y$; and if it contains Y and Y', then it also contains
YY' = Y'Y. These assertions follow at once from the fact that they are true of each
of the $\mathcal{F}_j$. Furthermore, the fact that each $\mathcal{F}_j$ is a consistent family means that $\mathcal{F}$
is consistent; one can use the criterion in (10.21).

Since each $\mathcal{F}_j$ contains the projectors needed to represent the initial data, along
with those needed to express the conclusions one is interested in, the same is true of
$\mathcal{F}$. Consequently, the task of assigning probabilities using the initial data together
with the dynamical weights of the histories, and then using probabilistic arguments
to reach certain conclusions, can be carried out in $\mathcal{F}$. But since it can be done in
$\mathcal{F}$, it can also be done in an identical fashion in any of the $\mathcal{F}_j$, as the latter contains
all the projectors of $\mathcal{F}$. Furthermore, any history in $\mathcal{F}$ will be assigned the same
weight in $\mathcal{F}$ and in any $\mathcal{F}_j$, since the weight W(Y) is defined directly in terms of the
history projector Y using a formula, (10.11), that makes no reference to the family
which contains the projector. Consequently, the conclusions one draws from initial 
data about physical properties or histories will be identical in all frameworks which 
contain the appropriate projectors. 

This internal consistency is illustrated by the discussion of the third (last) question
in Sec. 16.2: $\mathcal{F}$ is the family based on the sample space containing (16.6), and
$\mathcal{F}_1$ and $\mathcal{F}_2$ are two mutually incompatible refinements containing the histories in
(16.7) and (16.8), respectively. One can use either $\mathcal{F}_1$ or $\mathcal{F}_2$ to answer the question
“Where is the particle at t = 2?”, and the answer is the same.

As well as providing a proof of consistency, the preceding remarks suggest a 
certain strategy for carrying out quantum reasoning of the type we are concerned 
with: Use the smallest, or coarsest framework which contains both the initial data 
and the additional properties of interest in order to analyze the problem. Any other 
framework which can be used for the same purpose will be a refinement of the 
coarsest one, and will give the same answers, so there is no point in going to extra 
effort. If one has some specific initial data in mind, but wants to consider a variety 
of possible conclusions, some of which are incompatible with others, then start off 
with the coarsest framework $\mathcal{E}$ which contains all the initial data, and refine it in
the different ways needed to draw different conclusions. 

This was the strategy employed in Sec. 16.2, except that the coarsest sample 
space that contains the initial data $X^{*}$ consists of the two projectors $X^{*}$ and $\breve{I} - X^{*}$,
whereas we used a sample space (16.1) containing three histories rather than just
two. One reason for using $X^{\circ}$ and $X^{z}$ in this case is that each has a straightforward
physical interpretation, unlike their sum $\breve{I} - X^{*}$. The argument for consistency
given above shows that there is no harm in using a more refined sample space
as a starting point for further refinements, as long as it allows one to answer the
questions one is interested in, for in the end one will always get precisely the same
answer to any particular question. 


16.4 Interpretation of multiple frameworks 

The example of Sec. 16.2 illustrates a situation which arises rather often in reasoning
about quantum systems. The initial data $\mathcal{D}$ can be used in various different
frameworks $\mathcal{F}_1, \mathcal{F}_2, \ldots$, to yield different conclusions $\mathcal{C}_1, \mathcal{C}_2, \ldots$. The question
then arises as to the relationship among these different conclusions. In particular,
can one say that they all apply simultaneously to the same physical system? Gen- 
erally the conclusions are expressed in terms of probabilities that are greater than 
0 and less than 1, and thus involve some uncertainty. But sometimes, and we de- 
liberately focused on this situation in the example in Sec. 16.2, one concludes that 
an event (or history) has probability 1, in which case it is natural to interpret this 
as meaning that the event actually occurs, or is a “true” consequence of the initial 
data. Similarly, probability 0 can be interpreted to mean that the event does not 
occur, or is “false”. 

If two or more frameworks are compatible, there is nothing problematical in 
supposing that the corresponding conclusions apply simultaneously to the same 
physical system. The reason is that compatibility implies the existence of a common
refinement, a framework $\mathcal{G}$ which contains the projectors necessary to describe
the initial data and all of the conclusions. The consistency of quantum reasoning,
Sec. 16.3, means that the conclusions $\mathcal{C}_j$ will be identical in $\mathcal{F}_j$ and in $\mathcal{G}$. Consequently
one can think of $\mathcal{F}_1, \mathcal{F}_2, \ldots$ as representing alternative “views” or “perspectives”
of the same physical system, much as one can view an object, such as a
teacup, from various different angles. Certain details are visible from one perspec- 
tive and others from a different perspective, but there is no problem in supposing 
that they all form part of a single correct description, or that they are all simultane- 
ously true, for the object in question. 

In the example considered in Sec. 16.2, $\mathcal{F}_1$ could be the framework based on
(16.2), which allows one to describe the position of the particle at t = 1, but not
for any other t > 0, and $\mathcal{F}_2$ the one based on (16.6), which provides a description
of the position of the particle at t = 2, but not at t = 1. Their common refinement
provides a description of the position of the particle at t = 1 and t = 2, and $\mathcal{F}_1$ and
$\mathcal{F}_2$ can be thought of as supplying complementary parts of this description.

Conceptual difficulties arise, however, when two or more frameworks are incompatible.
Again with reference to the example in Sec. 16.2, let $\mathcal{F}_3$ be the framework
based on (16.5). It is incompatible with $\mathcal{F}_1$, because $X^{*c}$ in (16.2) and $X^{*\bar{a}}$ in
(16.5) do not commute with each other, since the projectors [1c] and $[1\bar{a}]$ at t = 1
do not commute. From the initial data one can conclude using $\mathcal{F}_1$ that the particle
possesses the property [1c] at t = 1 with probability 1. Using $\mathcal{F}_3$ and the same initial
data, one concludes that the particle has the property $[1\bar{a}]$ at t = 1, again with
probability 1. But even though [1c] and $[1\bar{a}]$ are both “true” (probability 1) consequences
of the initial data, one cannot think of them as representing properties
of the particle which are simultaneously true in the same sense one is accustomed
to when thinking about classical systems, for there is no property corresponding
to [1c] AND $[1\bar{a}]$, just as there is no property corresponding to $S_z = +1/2$ AND
$S_x = +1/2$ for a spin-half particle.

The conceptual difficulty goes away if one supposes that the two incompatible 
frameworks are being used to describe two distinct physical systems that are described
by the same initial data, or the same system during two different runs of
an experiment. In the case of two separate but identical systems, each with Hilbert
space $\mathcal{H}$, the combination is described by a tensor product $\mathcal{H} \otimes \mathcal{H}$, and employing
$\mathcal{F}_1$ for the first and $\mathcal{F}_3$ for the second is formally the same as a single consistent
family for the combination. This is analogous to the fact that while $S_z = +1/2$
AND $S_x = +1/2$ for a spin-half particle is quantum nonsense, there is no problem
with the statement that $S_z = +1/2$ for one particle and $S_x = +1/2$ for a different
particle. In the same way, different experimental runs for a single system must occur
during different intervals of time, and the tensor product $\breve{\mathcal{H}} \odot \breve{\mathcal{H}}$ of two history
Hilbert spaces plays the same role as $\mathcal{H} \otimes \mathcal{H}$ for two distinct systems.
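
The spin-half analogy is easy to verify directly. The sketch below uses the standard two-component representation of the spin states; the helper `commute` is illustrative.

```python
import numpy as np

z_plus = np.array([1.0, 0.0])                  # S_z = +1/2
x_plus = np.array([1.0, 1.0]) / np.sqrt(2)     # S_x = +1/2
Pz = np.outer(z_plus, z_plus)                  # projector [z+]
Px = np.outer(x_plus, x_plus)                  # projector [x+]
I2 = np.eye(2)

def commute(A, B):
    return bool(np.allclose(A @ B, B @ A))

# Same particle: [z+] and [x+] do not commute, so "S_z = +1/2 AND
# S_x = +1/2" does not correspond to any projector.
same_particle = commute(Pz, Px)

# Two particles: [z+] on the first and [x+] on the second act on different
# tensor factors, commute, and their product is the projector [z+] (x) [x+].
two_particles = commute(np.kron(Pz, I2), np.kron(I2, Px))
joint = np.kron(Pz, I2) @ np.kron(I2, Px)

print(same_particle, two_particles)
print(np.allclose(joint @ joint, joint))
```

The same mechanism is at work when two incompatible frameworks are assigned to two distinct systems or to two distinct runs: the projectors live on different tensor factors and therefore commute.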

Incompatible frameworks do give rise to conceptual problems when one tries to 
apply them to the same system during the same time interval. To be sure, there 
is never any harm in constructing as many alternative descriptions of a quantum 
system as one wants to, and writing them down on the same sheet of paper. The 
difficulty comes about when one wants to think of the results obtained using incom- 
patible frameworks as all referring simultaneously to the same physical system, or 
tries to combine the results of reasoning based upon incompatible frameworks. It 
is this which is forbidden by the single-framework rule of quantum reasoning. 

Note, by the way, that in view of the internal consistency of quantum reasoning 
discussed in Sec. 16.3, it is never possible, even using incompatible frameworks, 
to derive contradictory results starting from the same initial data. Thus for the 
example in Sec. 16.2, the fact that there is a framework in which one can conclude 
with certainty that the particle is at the site 1c at t = 1 means there cannot be
another framework in which one can conclude that the particle is someplace else at
t = 1, or that it can be at site 1c with some probability less than 1. Any framework
which contains both the initial data and the possibility of discussing whether the
particle is or is not at the site 1c at t = 1 will lead to precisely the same conclusion
as $\mathcal{F}_1$. This does not contradict the fact that in $\mathcal{F}_3$ the particle is predicted to be
in a state $[1\bar{a}]$ at t = 1: $\mathcal{F}_3$ does not contain [1c], and thus in this framework one
cannot address the question of whether the particle is at the site 1c at t = 1.

Even though the single-framework rule tells us that the result [1c] from framework
$\mathcal{F}_1$ and the result $[1\bar{a}]$ from $\mathcal{F}_3$ cannot be combined or compared, this state of
affairs is intuitively rather troubling, for the following reason. In classical physics
whenever one can draw the conclusion through one line of reasoning that a system
has a property P, and through a different line of reasoning that it has the property
Q, then it is correct to conclude that the system possesses both properties simultaneously.
Thus if P is true (assuming the truth of some initial data) and Q is also
true (using the same data), then it is always the case that P AND Q is true. By
contrast, in the case we have been discussing, [1c] is true (a correct conclusion
from the data) in $\mathcal{F}_1$, $[1\bar{a}]$ is true if we use $\mathcal{F}_3$, while the combination [1c] AND
$[1\bar{a}]$ is not even meaningful as a quantum property, much less true!

When viewed from the perspective of quantum theory, see Ch. 26, classical 
physics is an approximation to quantum theory in certain circumstances in which 
the corresponding quantum description requires only a single framework (or, which 
amounts to the same thing, a collection of compatible frameworks). Thus the prob- 
lem of developing rules for correct reasoning when one is confronted with a mul- 
tiplicity of incompatible frameworks never arises in classical physics, or in our 
everyday “macroscopic” experience which classical physics describes so well. But 
this is precisely why the rules of reasoning which are perfectly adequate and quite 
successful in classical physics cannot be depended upon to provide reliable con- 
ceptual tools for thinking about the quantum domain. However deep-seated may 
be our intuitions about the meaning of “true” and “false” in the classical realm, 
these cannot be uncritically extended into quantum theory. 

As probabilities can only be defined once a sample space has been specified, 
probabilistic reasoning in quantum theory necessarily depends upon the sample 
space and its associated framework. As a consequence, if “true” is to be iden- 
tified with “probability 1”, then the notion of “truth” in quantum theory, in the 
sense of deriving true conclusions from initial data that are assumed to be true, 
must necessarily depend upon the framework which one employs. This feature 
of quantum reasoning is sometimes regarded as unacceptable because it is hard 
to reconcile with an intuition based upon classical physics and ordinary everyday 
experience. But classical physics cannot be the arbiter for the rules of quantum 
reasoning. Instead, these rules must conform to the mathematical structure upon 
which quantum theory is based, and as has been pointed out repeatedly in previ- 
ous chapters, this structure is significantly different from that of a classical phase 
space. To acquire a good “quantum intuition”, one needs to work through vari- 
ous quantum examples in which a system can be studied using different incom- 
patible frameworks. Several examples have been considered in previous chapters, 
and there are some more in later chapters. I myself have found the example of 
a beam splitter inside a box, Fig. 18.3 on page 253, particularly helpful. For
additional comments on multiple incompatible frameworks, see Secs. 18.4 and
27.3.


17.1 Introduction

I place a tape measure with one end on the floor next to a table, read the height 
of the table from the tape, and record the result in a notebook. What are the 
essential features of this measurement process? The key point is the establishment
of a correlation between a physical property (the height) of a measured system
(the table) and a suitable record (in the notebook), which is itself a physical
property of some other system. It will be convenient in what follows to think 
of this record as part of the measuring apparatus, which consists of everything 
essential to the measuring process apart from the measured system. Human be- 
ings are not essential to the measuring process. The height of a table could be 
measured by a robot. In the modern laboratory, measurements are often carried
out by automated equipment, and the results stored in a computer memory or on 
magnetic tape, etc. While scientific progress requires that human beings pay atten- 
tion to the resulting data, this may occur a long time after the measurements are 
completed. 

In this and the next chapter we consider measurements as physical processes 
in which a property of some quantum system, which we shall usually think of as 
some sort of “particle”, becomes correlated with the outcome of the measurement, 
itself a property of another quantum system, the “apparatus”. Both the measured 
system and the apparatus which carries out the measurement are to be thought of 
as parts of a single closed quantum mechanical system. This makes it possible to 
apply the principles of quantum theory developed in earlier chapters. There are 
no special principles which apply to measurements in contrast to other quantum 
processes. We need an appropriate Hilbert space for the measured system plus 
apparatus, some sort of initial state, unitary time development operators, and a 
suitable framework or consistent family of histories. There are, as always, many
possible frameworks. A correct quantum description of the measuring process
must employ a single framework; mixing results from incompatible frameworks 
will only cause confusion. 

In practice it is necessary to make a number of idealizations and approximations 
in order to discuss measurements as quantum mechanical processes. This should 
not be surprising, for the same is true of classical physics. For example, the mo- 
tion of the planets in the solar system can be described to quite high precision by 
treating them as point masses subject to gravitational forces, but of course this is 
not an exact description. The usual procedure followed by a physicist is to first 
work out an approximate description of some situation in order to get an idea of 
the various magnitudes involved, and then see how this first approximation can 
be improved, if greater precision is needed, by including effects which have been 
ignored. We shall follow this approach in this and the following chapter, some- 
times pointing out how a particular approximation can be improved upon, at least 
in principle. The aim is physical insight, not a precise formalism which will cover 
all cases. 

Quantum measurements can be divided into two broad categories: nondestruc- 
tive and destructive. In nondestructive measurements, also called nondemolition 
measurements, the measured property is preserved, so the particle has the same, 
or almost the same property after the measurement is completed as it had be- 
fore the measurement. While it is easy to make nondestructive measurements on 
macroscopic objects, such as tables, nondestructive measurements of microscopic 
quantum systems are much more difficult. Even when a quantum measurement is 
nondestructive for a particular property, it will be destructive for many other prop- 
erties, so that the term nondestructive can only be defined relative to some prop- 
erty or properties, and does not refer to all conceivable properties of the quantum 
system. 

In destructive measurements the property of interest is altered during the mea- 
surement process, often in an uncontrolled fashion, so that after the measurement 
the particle no longer has this property. For example, the kinetic energy of an en- 
ergetic particle can be measured by bringing it to rest in a scintillator and finding 
the amount of light produced. This tells one what the energy of the particle was 
before it entered the scintillator, whereas at the end of the measurement process 
the kinetic energy of the particle is zero. In this and other examples of destructive 
measurements it is clear that the correlation of interest is between a property the 
particle had before the measurement took place, and the state of the apparatus after 
the measurement, and thus involves properties at two different times. The absence 
of a systematic way of treating correlations involving different times, except in 
very special cases, is the basic reason why the theory of measurement developed 
by von Neumann, Sec. 18.2, is not very satisfactory.


17.2 Microscopic measurement

The measurement of the spin of a spin-half particle illustrates many of the princi- 
ples of the quantum theory of measurement, so we begin with this simple case, us- 
ing a certain number of approximations to keep the discussion from becoming too 
complicated. Consider a neutral spin-half particle, e.g., a silver atom in its ground 
state, moving through the inhomogeneous magnetic field of a Stern-Gerlach apparatus,
shown schematically in Fig. 17.1. We shall assume the magnetic field is such
that if the z-component $S_z$ of the spin is +1/2, there is an upwards force on the particle,
and it emerges from the magnet moving upwards, whereas if $S_z = -1/2$, the
force is in the opposite direction, and the particle moves downward as it leaves the 
magnet. 



Fig. 17.1. Spin-half particle passing through a Stern-Gerlach magnet. 

This can be described in quantum mechanical terms as follows. The spin states 
of the particle corresponding to $S_z = \pm 1/2$ are $|z^+\rangle$ and $|z^-\rangle$ in the notation of
Sec. 4.2. Let $t_0$ and $t_1$ be two successive times preceding the moment at which
the particle enters the magnetic field, see Fig. 17.1, and $t_2$ a later time after it has
emerged from the magnetic field. Assume that the unitary time development from
$t_0$ to $t_1$ to $t_2$ is given by

$$
\begin{aligned}
|z^+\rangle|\omega\rangle &\mapsto |z^+\rangle|\omega'\rangle \mapsto |z^+\rangle|\omega^+\rangle, \\
|z^-\rangle|\omega\rangle &\mapsto |z^-\rangle|\omega'\rangle \mapsto |z^-\rangle|\omega^-\rangle,
\end{aligned}
\tag{17.1}
$$

where $|\omega\rangle$, $|\omega'\rangle$, $|\omega^+\rangle$, $|\omega^-\rangle$ are wave packets for the particle’s center of mass, at
the locations indicated in Fig. 17.1. (One could also write these as $\omega(\mathbf{r})$, etc.)

One can think of the center of mass of the particle as the “apparatus”. The
two possible outcomes of the measurement are that the particle emerges from the
magnet in one of the two spatial wave packets $|\omega^+\rangle$ or $|\omega^-\rangle$. It is important that the
outcome wave packets be orthogonal,

$$
\langle\omega^+|\omega^-\rangle = 0, \tag{17.2}
$$

as otherwise we cannot speak of them as mutually-exclusive possibilities. This 
condition will be fulfilled if the wave packets have negligible overlap, as suggested 
by the sketch in Fig. 17.1. 
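
To get a feeling for what "negligible overlap" means, one can model the two outcome packets as one-dimensional Gaussians. This is purely an illustrative assumption (the text does not specify the packet shape), and `packet` and `overlap` are hypothetical helper names. For two normalized Gaussians of position spread σ whose centers are a distance d apart, the inner product is exp(−d²/(8σ²)), which the sketch checks by numerical integration.

```python
import numpy as np

def packet(x, center, sigma):
    """Normalized real Gaussian wave packet with position spread sigma."""
    norm = (2 * np.pi * sigma**2) ** -0.25
    return norm * np.exp(-((x - center) ** 2) / (4 * sigma**2))

def overlap(separation, sigma=1.0, n=20001):
    """Inner product <w+|w-> for packets centered at +/- separation/2."""
    half_width = separation / 2 + 12 * sigma      # grid comfortably covers both packets
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    up = packet(x, +separation / 2, sigma)
    down = packet(x, -separation / 2, sigma)
    return float(np.sum(up * down) * dx)          # simple Riemann sum

for d in (0.0, 2.0, 5.0, 10.0):
    print(d, overlap(d), np.exp(-d**2 / 8))       # numerical vs analytic value
```

In this model the overlap falls off very rapidly with the separation, so condition (17.2) holds to an excellent approximation once the packets are separated by several times their width.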

In calculating the unitary time development in (17.1) we assume that the Hamiltonian
for the particle includes an interaction with the magnetic field, and this field
is assumed to be “classical”; that is, it provides a potential for the particle’s mo- 
tion, but does not itself need to be described using an appropriate field-theoretical 
Hilbert space. Similarly, we have omitted from our quantum description the atoms 
of the magnet which actually produce this magnetic field. These “inert” parts of the 
apparatus could, in principle, be included in the sort of quantum description dis- 
cussed in Sec. 17.3, but this is an unnecessary complication, since their essential 
role is included in the unitary time development in (17.1). 

The process shown in Fig. 17.1 can be thought of as a measurement because 
the value of $S_z$ before the measurement, the property being measured, is correlated
with the spatial wave packet of the particle after the measurement, which forms the
output of the measurement. It is also the case that $S_z$ before the measurement is
correlated with its value after the measurement, and this means the measurement is
nondestructive for the properties $S_z = \pm 1/2$. One can easily imagine a destructive
version of the same measurement by supposing that the wave packets emerging
from the field gradient of the main magnet pass through some regions of uniform
magnetic field, which do not affect the center of mass motion, but do cause a precession
of the spin. Consequently, at the end of the process the location of the wave
packet for the center of mass will still serve to indicate the value of $S_z$ before the
measurement began, even though the final value of $S_z$ need not be the same as the
initial value.

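Suppose that the initial spin state is not one of the possibilities $S_z = \pm 1/2$, but
instead

$$
|x^+\rangle = \bigl(|z^+\rangle + |z^-\rangle\bigr)/\sqrt{2} \tag{17.3}
$$

corresponding to $S_x = +1/2$. What happens during the measuring process? The
unitary time development of the initial state

$$
|\psi_0\rangle = |x^+\rangle|\omega\rangle = \bigl(|z^+\rangle|\omega\rangle + |z^-\rangle|\omega\rangle\bigr)/\sqrt{2} \tag{17.4}
$$

is obtained by taking a linear combination of the two cases in (17.1):

$$
|\psi_0\rangle \mapsto |x^+\rangle|\omega'\rangle \mapsto \bigl(|z^+\rangle|\omega^+\rangle + |z^-\rangle|\omega^-\rangle\bigr)/\sqrt{2}. \tag{17.5}
$$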

The unitary history in (17.5) cannot be used to describe the measuring process,
because the measurement outcomes, $|\omega^+\rangle$ and $|\omega^-\rangle$, are clearly incompatible with
the final state in (17.5). A quantum mechanical description of a measurement with
particular outcomes must, obviously, employ a framework in which these outcomes
are represented by appropriate projectors, as in the consistent family whose support
consists of the two histories

$$
[\psi_0] \odot [x^+][\omega'] \odot
\begin{cases}
[z^+][\omega^+], \\
[z^-][\omega^-].
\end{cases}
\tag{17.6}
$$

The notation, see (14.12), indicates that the two histories are identical at the times
$t_0$ and $t_1$, but contain different events at $t_2$. While this family contains the measurement
outcomes $[\omega^+]$ and $[\omega^-]$, it is still not satisfactory for discussing the process
in Fig. 17.1 as a measurement, because it does not allow us to relate these outcomes
to the spin states $[z^+]$ and $[z^-]$ of the particle before the measurement took place.
Since the properties $S_z = \pm 1/2$ are incompatible with a spin state $[x^+]$ at $t_1$, (17.6)
does not allow us to say anything about $S_z$ before the particle enters the magnetic
field gradient. It is true that $S_z$ at $t_2$ is correlated with the measurement outcome if
we use (17.6). But this would also be true if the apparatus had somehow produced 
a particle in a certain spin state without any reference to its previous properties, 
and calling that a “measurement” would be rather odd. 

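A more satisfactory description of the process in Fig. 17.1 as a measurement is
obtained by using an alternative consistent family whose support is the two histories

$$
[\psi_0] \odot
\begin{cases}
[z^+][\omega'] \odot [z^+][\omega^+], \\
[z^-][\omega'] \odot [z^-][\omega^-].
\end{cases}
\tag{17.7}
$$

As both histories have positive weights, one sees that

$$
\begin{aligned}
\Pr(z_1^+ \mid \omega_2^+) &= 1 = \Pr(z_1^- \mid \omega_2^-), \\
\Pr(\omega_2^+ \mid z_1^+) &= 1 = \Pr(\omega_2^- \mid z_1^-),
\end{aligned}
\tag{17.8}
$$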


where we follow our usual convention that square brackets can be omitted and
subscripts refer to times: e.g., $z_1^+$ is the same as $[z^+]_1$ and means
$S_z = +1/2$ at $t_1$. (In addition, the initial state $\psi_0$ could be included
among the conditions, but, as usual, we omit it.) These conditional probabilities
tell us that if the measurement outcome is $\omega^+$ at $t_2$, we can be certain
that the particle had $S_z = +1/2$ at $t_1$, and vice versa; likewise, $\omega^-$
at $t_2$ implies $S_z = -1/2$ at $t_1$. (For an initial spin state $|z^+\rangle$
the conditional probabilities involving $z^-$ and $\omega^-$ are undefined, and
those involving $z^+$ and $\omega^+$ are undefined for an initial $|z^-\rangle$.)

It is (17.8) which tells us that what we have been referring to as a measurement 
process actually deserves that name, for it shows that the result of this process is a 
correlation between specific outcomes and appropriate properties of the measured 
system before the measurement took place. If these probabilities were slightly less 
than 1, it would still be possible to speak of an approximate measurement, and in 
practice all measurements are to some degree approximate. 
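
The conditional probabilities in (17.8) can be checked numerically on a toy
model. The following is only a sketch under stated assumptions: the apparatus is
given four basis states labeled `w`, `w'`, `w+`, `w-` (standing for $\omega$,
$\omega'$, $\omega^+$, $\omega^-$), the dynamics is a permutation of basis labels
chosen to reproduce (17.1), and the helper names (`step`, `project`, `weight`)
are illustrative, not taken from the text. Weights are squared norms of chain
kets, and the spin at $t_2$ is summed over.

```python
from math import sqrt, isclose

# A state is a dict {(spin, apparatus): amplitude} in the S_z product basis.
# Labels "w", "w'", "w+", "w-" stand for omega, omega', omega^+, omega^-.

def step(state):
    """One unitary time step reproducing (17.1).

    For each spin value the apparatus labels are permuted, so the map is
    unitary. The return w+/w- -> w is an arbitrary (hypothetical) completion
    of the permutation; it is never used in the histories below.
    """
    cycle = {
        "z+": {"w": "w'", "w'": "w+", "w+": "w", "w-": "w-"},
        "z-": {"w": "w'", "w'": "w-", "w-": "w", "w+": "w+"},
    }
    return {(spin, cycle[spin][app]): amp for (spin, app), amp in state.items()}

def project(state, spin=None, app=None):
    """Apply a projector that is diagonal in the S_z product basis."""
    return {
        key: amp
        for key, amp in state.items()
        if (spin is None or key[0] == spin) and (app is None or key[1] == app)
    }

def norm2(state):
    return sum(abs(amp) ** 2 for amp in state.values())

# Initial state (17.4): |x+>|omega> = (|z+>|omega> + |z->|omega>)/sqrt(2).
psi0 = {("z+", "w"): 1 / sqrt(2), ("z-", "w"): 1 / sqrt(2)}

def weight(spin1, app2):
    """Weight of the history: psi0 at t0, S_z = spin1 at t1, pointer app2 at t2."""
    ket = project(step(psi0), spin=spin1, app="w'")  # event at t1
    ket = project(step(ket), app=app2)               # event at t2
    return norm2(ket)

def pr_spin_given_pointer(spin1, app2):
    total = sum(weight(s, app2) for s in ("z+", "z-"))
    return weight(spin1, app2) / total

def pr_pointer_given_spin(app2, spin1):
    total = sum(weight(spin1, a) for a in ("w+", "w-"))
    return weight(spin1, app2) / total

print(pr_spin_given_pointer("z+", "w+"))  # 1.0
print(pr_pointer_given_spin("w-", "z-"))  # 1.0
```

The two histories with mismatched spin and pointer have zero weight in this
model, which is why both conditional probabilities come out equal to 1.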

In conclusion it is worth emphasizing that in order to describe a quantum
process as a measurement it is necessary to employ a framework which includes both
the measurement outcomes (pointer positions) and the properties of the measured 
system before the measurement took place, by means of suitable projectors. These 
requirements are satisfied by (17.7), whereas (17.6), even though it is an improve- 
ment over a unitary family, cannot be used to derive the correlations (17.8) that are 
characteristic of a measurement. 


17.3 Macroscopic measurement, first version 

If the results are to be of use to scientists, measurements of the properties of mi- 
croscopic quantum systems must eventually produce macroscopic results visible to 
the eye or at least accessible to the computer. This requires devices that amplify 
microscopic signals and produce some sort of macroscopic record. These pro- 
cesses are thermodynamically irreversible, and this irreversibility contributes to the 
permanence of the resulting records. Thus even though the production of certain 
correlations, which is the central feature of the measuring process, can occur on a 
microscopic scale, as discussed in the previous section, macroscopic systems must 
be taken into account when quantum theory is used to describe practical measure- 
ments. A full and detailed quantum mechanical description of the processes going 
on in a macroscopic piece of apparatus containing 10 23 particles is obviously not 
possible. Nonetheless, by making a certain number of plausible assumptions it is 
possible to explore what such a description might contain, and this is what we shall 
do in this and the next section, for a macroscopic version of the measurement of 
the spin of a spin-half particle. 

Once again, assume that the particle passes through a magnetic field gradient, 
Fig. 17.1, which splits the center of mass wave packet into two pieces which are 
eventually separated by a macroscopic distance. The macroscopic measurement is 
then completed by adding particle detectors to determine whether the particle is in 
the upper or lower beam as it leaves the magnetic field. One could, for example, 
suppose that light from a laser ionizes a silver atom as it travels along one of the 
paths emerging from the apparatus, and the resulting electron is accelerated in an 
electric field and made to produce a macroscopic current by a cascade process. 
Detection of single atoms in this fashion is technically feasible, though it is not 
easy. Of course, one must expect that in such a measurement process the spin 
direction of the atom will not be preserved; indeed, the atom itself is broken up by 
the ionization process. Hence such a measurement is destructive. 

Let us assume once again that three times $t_0$, $t_1$, and $t_2$ are used in a
quantum description of the measurement process. The times $t_0$ and $t_1$ precede
the entry of the particle into the magnetic field, Fig. 17.1, whereas $t_2$ is
long enough after the particle has emerged from the magnetic field to allow its
detection, and the result indicating the channel in which it emerged to be
recorded in some macroscopic device, say a pointer easily visible to the naked
eye. Assume that before the measurement takes place the pointer points in a
horizontal direction, and at the completion of the measurement it either points
upwards, indicating that the particle emerged in the upper channel corresponding
to $S_z = +1/2$, or downwards, indicating that the particle emerged in the
$S_z = -1/2$ channel. Of course, no one would build
an apparatus in this fashion nowadays, but when discussing conceptual questions 
there is an advantage in using something easily visualized, rather than the direction 
of magnetization in some region on a magnetic tape or disk. The principles are in 
any case the same. 

As a first attempt at a quantum description of such a macroscopic measurement,
assume that at $t_0$ the apparatus plus the center of mass of the particle whose
spin is to be measured is in a quantum state $|\Omega\rangle$. Then we might
expect that the unitary time development of the apparatus plus particle would be
similar to (17.1), that is, of the form

\[
  \begin{aligned}
    |z^+\rangle|\Omega\rangle &\mapsto |z^+\rangle|\Omega'\rangle \mapsto |\Omega^+\rangle, \\
    |z^-\rangle|\Omega\rangle &\mapsto |z^-\rangle|\Omega'\rangle \mapsto |\Omega^-\rangle,
  \end{aligned}
  \tag{17.9}
\]


where $|\Omega^+\rangle$ is some state of the apparatus in which the pointer
points upwards, and $|\Omega^-\rangle$ a state in which the pointer points
downwards. The difference between $|\Omega\rangle$ and $|\Omega'\rangle$ reflects
both the fact that the position of the center of mass of the particle changes
between $t_0$ and $t_1$, and that the apparatus itself is evolving in time. The
only assumption we have made is that this time evolution is not influenced by the
direction of the spin of the particle, which seems plausible. In contrast to
(17.1), the particle spin does not appear at time $t_2$ in (17.9). This is because
we are dealing with a destructive measurement, and the value of the particle’s
spin at $t_2$ is irrelevant. Indeed, the concept may not even be well defined.
Thus $|\Omega^+\rangle$ and $|\Omega^-\rangle$ are defined on a slightly different
Hilbert space than $|\Omega\rangle$ and $|\Omega'\rangle$.

The counterpart of (17.2) is

\[
  \langle\Omega^+|\Omega^-\rangle = 0,
  \tag{17.10}
\]

a consequence of unitary time development: since the two states in (17.9) at time
$t_0$ are orthogonal to each other, those at $t_2$ must also be orthogonal. But (17.10) is
also what one would expect on physical grounds for quantum states corresponding 
to distinct macroscopic situations, in this case different orientations of the pointer. 
The orthogonality in (17.2) was justified by assuming that the two emerging wave 
packets in Fig. 17.1 have negligible overlap. Two distinct pointer positions will 
mean that there are an enormous number of atoms whose wave packets have neg- 
ligible overlap, and thus (17.10) will be satisfied to an excellent approximation. 

It follows from (17.9) and our assumption about the way in which
$|\Omega^+\rangle$ and $|\Omega^-\rangle$ are related to the pointer position that
if the particle starts off with $S_z = +1/2$ at $t_0$, the pointer will be
pointing upwards at $t_2$, while if the particle starts off with $S_z = -1/2$,
the pointer will later point downwards. But what will happen if the initial spin
state is not an eigenstate of $S_z$? Let us assume a spin state $|x^+\rangle$,
(17.3), at $t_0$ corresponding to $S_x = +1/2$. The unitary time development of
the initial state

\[
  |\Psi_0\rangle = |x^+\rangle|\Omega\rangle
    = \bigl(|z^+\rangle|\Omega\rangle + |z^-\rangle|\Omega\rangle\bigr)/\sqrt{2},
  \tag{17.11}
\]

the macroscopic counterpart of (17.4), is given by

\[
  |\Psi_0\rangle \mapsto |x^+\rangle|\Omega'\rangle \mapsto |\bar\Omega\rangle,
  \qquad
  |\bar\Omega\rangle = \bigl(|\Omega^+\rangle + |\Omega^-\rangle\bigr)/\sqrt{2}.
  \tag{17.12}
\]

The state $|\bar\Omega\rangle$ on the right side is a macroscopic quantum
superposition (MQS) of states representing distinct macroscopic situations: a
pointer pointing up and a pointer pointing down. It is incompatible with the
measurement outcomes $\Omega^+$ and $\Omega^-$ in the same way as the right side
of (17.5) is incompatible with $\omega^+$ and $\omega^-$, so it cannot be used for
describing the possible outcomes of the measurement. See the discussion in
Sec. 9.6.

The measurement outcomes can be discussed using a family resembling (17.6)
with support

\[
  [\Psi_0] \odot [x^+][\Omega'] \odot
  \begin{cases}
    [\Omega^+], \\
    [\Omega^-].
  \end{cases}
  \tag{17.13}
\]

However, as pointed out in connection with (17.6), the presence of $[x^+]$ in the
histories in this family at times preceding the measurement makes it impossible to
discuss $S_z$. Thus one cannot employ (17.13) to obtain a correlation between the
measurement outcomes and the value of $S_z$ before the measurement took place.

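Hence we are led to consider yet another family, the counterpart of (17.7), whose
support is the two histories:

\[
  [\Psi_0] \odot
  \begin{cases}
    [z^+][\Omega'] \odot [\Omega^+], \\
    [z^-][\Omega'] \odot [\Omega^-].
  \end{cases}
  \tag{17.14}
\]

From it we can deduce the conditional probabilities

\[
  \begin{aligned}
    \Pr(z_1^+ \mid \Omega_2^+) &= 1 = \Pr(z_1^- \mid \Omega_2^-), \\
    \Pr(\Omega_2^+ \mid z_1^+) &= 1 = \Pr(\Omega_2^- \mid z_1^-),
  \end{aligned}
  \tag{17.15}
\]
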

which are the analogs of (17.8). The initial state can be thought of as one of the 
conditions, though it is not shown explicitly. 

However, (17.15), while technically correct, does not really provide the sort of
result one wants from a macroscopic theory of measurement. What one would like
to say is: “Given the initial state and the fact that the pointer points up at the
time $t_2$, $S_z$ must have had the value $+1/2$ at $t_1$.” While the state
$|\Omega^+\rangle$ is, indeed, a state of the apparatus for which the pointer is
up, it does not mean the same thing as “the pointer points up”. There are an
enormous number of quantum states of the apparatus consistent with “the pointer
points up”, and $|\Omega^+\rangle$ is just one of these, so it contains a lot of
information in addition to the direction of the pointer. It provides a very
precise description of the state of the apparatus, whereas what we would like to
have is a conditional probability whose condition involves only a relatively
coarse “macroscopic” description of the apparatus. One can also fault the use of
the family (17.14) on the grounds that $|\Psi_0\rangle$ is itself a very precise
description of the initial state of the apparatus. In practice it is impossible to
set up an apparatus in such a way that one can be sure it is in such a precise
initial state.

What we need are conditional probabilities which lead to the same conclusions 
as (17.15), but with conditions which involve a much less detailed description of 
the apparatus at $t_0$ and $t_2$. Such coarse-grained descriptions in classical physics
are provided by statistical mechanics. While quantum statistical mechanics lies 
outside the scope of this book, the histories formalism developed earlier provides 
tools which are adequate for the task at hand, and we shall use them in the next 
section to provide an improved version of macroscopic measurements. 


17.4 Macroscopic measurement, second version 

Physical properties in quantum theory are associated with subspaces of the Hilbert 
space, or the corresponding projectors. Often these are projectors on relatively 
small subspaces. However, it is also possible to consider projectors which corre- 
spond to macroscopic properties of a piece of apparatus, such as “the pointer points 
upwards”. We shall call such projectors “macro projectors”, since they single out 
regions of the Hilbert space corresponding to macroscopic properties. 

Let $Z$ be a macro projector onto the initial state of the apparatus ready to
carry out a measurement of the spin of the particle. It projects onto an enormous
subspace $\mathcal{Z}$ of the Hilbert space, one with a dimension,
$\mathrm{Tr}[Z]$, which is of the order of $e^{S/k}$, where $S$ is the (absolute)
thermodynamic entropy of the apparatus, and $k$ is Boltzmann’s constant. Thus
$\mathrm{Tr}[Z]$ could be 10 raised to the power $10^{23}$. Such a macro projector
is not uniquely defined, but the ambiguity is not important for the argument which
follows. It is convenient to include in $Z$ the information about the center of
mass of the particle at $t_0$, but not its spin. Similarly, the apparatus after
the measurement can be described by the macro projectors $Z^+$, projecting on a
subspace $\mathcal{Z}^+$ for which the pointer points up, and $Z^-$, projecting on
a subspace $\mathcal{Z}^-$ for which the pointer points down. For reasons
indicated in Sec. 17.3, any state in which the pointer is directed upwards will
surely be orthogonal to any state in which it is directed downwards, and thus

\[
  Z^+ Z^- = 0.
  \tag{17.16}
\]

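Let $\{|\Omega_j\rangle\}$, $j = 1, 2, \ldots$ be an orthonormal basis for
$\mathcal{Z}$. We assume that the unitary time evolution from $t_0$ to $t_1$ to
$t_2$ takes the form

\[
  \begin{aligned}
    |z^+\rangle|\Omega_j\rangle &\mapsto |z^+\rangle|\Omega'_j\rangle \mapsto |\Omega_j^+\rangle, \\
    |z^-\rangle|\Omega_j\rangle &\mapsto |z^-\rangle|\Omega'_j\rangle \mapsto |\Omega_j^-\rangle,
  \end{aligned}
  \tag{17.17}
\]

for $j = 1, 2, \ldots$, and that for every $j$,

\[
  Z^+|\Omega_j^+\rangle = |\Omega_j^+\rangle, \qquad
  Z^-|\Omega_j^-\rangle = |\Omega_j^-\rangle.
  \tag{17.18}
\]

That is to say, whatever may be the precise initial state of the apparatus at
$t_0$, if $S_z = +1/2$ at this time, then at $t_2$ the apparatus pointer will be
directed upwards, whereas if $S_z = -1/2$ at $t_0$, the pointer will later be
pointing downwards. Note that combining (17.16) with (17.18) tells us that for
every $j$

\[
  Z^+|\Omega_j^-\rangle = 0 = Z^-|\Omega_j^+\rangle.
  \tag{17.19}
\]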
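
The step from (17.16) and (17.18) to (17.19) can be spelled out in one line: use
(17.18) to insert the projector that leaves the ket unchanged, then apply (17.16).

```latex
\[
  Z^+ |\Omega_j^-\rangle
    = Z^+ \bigl( Z^- |\Omega_j^-\rangle \bigr)
    = (Z^+ Z^-)\,|\Omega_j^-\rangle = 0,
  \qquad
  Z^- |\Omega_j^+\rangle
    = (Z^- Z^+)\,|\Omega_j^+\rangle = 0 .
\]
```

The second equality uses $Z^- Z^+ = (Z^+ Z^-)^\dagger = 0$, which holds because
projectors are Hermitian.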

Since the $\{|\Omega_j^+\rangle\}$ are mutually orthogonal (because (17.17)
represents a unitary time development), they span a subspace of the Hilbert space
having the same dimension, $\mathrm{Tr}[Z]$, as $\mathcal{Z}$. Hence (17.18) can
only be true if the subspace $\mathcal{Z}^+$ onto which $Z^+$ projects has a
dimension $\mathrm{Tr}[Z^+]$ at least as large as $\mathrm{Tr}[Z]$, and the same
comment applies to $\mathcal{Z}^-$. We expect the process which results in moving
the pointer to a particular position to be irreversible in the thermodynamic
sense: the entropy of the apparatus will increase during this process. Since, as
noted earlier, the trace of a macro projector is on the order of $e^{S/k}$, where
$S$ is the thermodynamic entropy, even a modest (macroscopic) increase in entropy
is enough to make $\mathrm{Tr}[Z^+]$ (and likewise $\mathrm{Tr}[Z^-]$) enormously
larger than the already very large $\mathrm{Tr}[Z]$: the ratio
$\mathrm{Tr}[Z^+]/\mathrm{Tr}[Z]$ will be 10 raised to a large power. There is
thus no difficulty in supposing that (17.18) is satisfied, as there is plenty of
room in $\mathcal{Z}^+$ and $\mathcal{Z}^-$ to hold all the states which evolve
unitarily from $\mathcal{Z}$, and in this respect the unitary time development
assumed in (17.17) is physically plausible.
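
To get a feeling for the sizes involved, one can estimate the ratio of traces from
the entropy increase, using the estimate that each trace is of order $e^{S/k}$.
The entropy change used below (one millijoule per kelvin) is an assumed, purely
illustrative figure and not a value from the text; the function name is likewise
hypothetical.

```python
from math import log

K_BOLTZMANN = 1.380649e-23  # J/K (exact value in the 2019 SI)

def log10_trace_ratio(delta_s):
    """log10 of Tr[Z+]/Tr[Z], assuming each trace is of order exp(S/k)."""
    return delta_s / (K_BOLTZMANN * log(10))

# Hypothetical entropy increase of the apparatus during the measurement.
delta_s = 1.0e-3  # J/K, illustrative assumption
exponent = log10_trace_ratio(delta_s)
print(f"Tr[Z+]/Tr[Z] is of order 10^({exponent:.3g})")
```

For this assumed entropy increase the exponent is about $3 \times 10^{19}$, so
the ratio is indeed 10 raised to a very large power.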

Now let us consider various families of histories based upon an initial state
represented by the projector

\[
  \Phi_0 = [x^+] \otimes Z,
  \tag{17.20}
\]

which in physical terms means that the particle has $S_x = +1/2$ and the apparatus
is ready to carry out the measurement. Note that $\Phi_0$, in contrast to the pure
state $\Psi_0$ used in Sec. 17.3, is a projector on a very large subspace, and
thus a relatively imprecise description of the initial state of the apparatus.





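Consider first the case of unitary time evolution starting with $\Phi_0$ at $t_0$
and leading to a state

\[
  \Phi_2 = T(t_2, t_0)\,\Phi_0\,T(t_0, t_2)
    = \sum_j |\bar\Omega_j\rangle\langle\bar\Omega_j|,
  \tag{17.21}
\]

at $t_2$, where

\[
  |\bar\Omega_j\rangle = \bigl(|\Omega_j^+\rangle + |\Omega_j^-\rangle\bigr)/\sqrt{2}
  \tag{17.22}
\]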

is an MQS state. None of the terms in the sum in (17.21) commutes with $Z^+$ and
$Z^-$, and it is easy to show that the same is true of the sum itself (that is,
there are no cancellations). Since $\Phi_2$ does not commute with $Z^+$ and $Z^-$,
a history using unitary time evolution precludes any discussion of measurement
outcomes. Another way of stating this is that whatever the initial apparatus
state, unitary time evolution will inevitably lead to an MQS state in which the
pointer positions have no meaning. Hence it is essential to employ a nonunitary
history in order to discuss the measurement process; using macro projectors does
not change our earlier conclusion in this respect.
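
The absence of cancellations in the sum can be checked explicitly from (17.18),
(17.19), and (17.22). The following is a sketch of that calculation:

```latex
\[
  Z^+ \Phi_2 = \tfrac12 \sum_j |\Omega_j^+\rangle
      \bigl( \langle\Omega_j^+| + \langle\Omega_j^-| \bigr),
  \qquad
  \Phi_2 Z^+ = \tfrac12 \sum_j
      \bigl( |\Omega_j^+\rangle + |\Omega_j^-\rangle \bigr) \langle\Omega_j^+| ,
\]
\[
  Z^+ \Phi_2 - \Phi_2 Z^+ = \tfrac12 \sum_j
      \bigl( |\Omega_j^+\rangle\langle\Omega_j^-|
           - |\Omega_j^-\rangle\langle\Omega_j^+| \bigr) .
\]
```

Because the kets $|\Omega_j^\pm\rangle$ form an orthonormal collection (they
arise by unitary evolution from orthonormal initial states), applying this
commutator to any $|\Omega_k^-\rangle$ gives $\tfrac12 |\Omega_k^+\rangle$, which
is not zero. The same argument with the roles of $+$ and $-$ exchanged applies to
$Z^-$.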

Likewise the counterpart of the family (17.13), in which one can discuss
measurement outcomes but not the value of $S_z$ at $t_1$, is unsatisfactory as a
description of a measurement process for the same reason indicated earlier. Thus
we are led to consider a family analogous to that in (17.14), whose support
consists of the two histories

\[
  \Phi_0 \odot
  \begin{cases}
    [z^+] \odot Z^+, \\
    [z^-] \odot Z^-,
  \end{cases}
  \tag{17.23}
\]


involving events at the times $t_0$, $t_1$, $t_2$. Note that (in contrast to
(17.14)) no mention is made of an apparatus state at $t_1$, and of course no
mention is made of a spin state at $t_2$. The histories
$\Phi_0 \odot [z^-] \odot Z^+$ and $\Phi_0 \odot [z^+] \odot Z^-$ have zero weight
in view of (17.19).

As both histories in (17.23) have positive weight, it is clear that

\[
  \begin{aligned}
    \Pr(z_1^+ \mid Z_2^+) &= 1 = \Pr(z_1^- \mid Z_2^-), \\
    \Pr(Z_2^+ \mid z_1^+) &= 1 = \Pr(Z_2^- \mid z_1^-),
  \end{aligned}
  \tag{17.24}
\]


where the initial state $\Phi_0$ can be thought of as one of the conditions, even
though it is not shown explicitly. Thus if the pointer is directed upwards at
$t_2$, then $S_z$ had the value $+1/2$ at $t_1$, while a pointer directed
downwards at $t_2$ means that $S_z$ was $-1/2$ at $t_1$. These results are
formally the same as those in (17.15), but (17.24) is more satisfactory from a
physical point of view in that the conditions (including the implicit $\Phi_0$)
only involve “macroscopic” information about the measuring apparatus. Note that
(17.14) is not misleading, even though its physical interpretation is less
satisfactory than (17.24), and the former is somewhat easier to derive. It is
often the case that one can model a macroscopic measurement process in somewhat
simplistic terms, and nonetheless obtain a plausible answer. Of course, if there
are any doubts about this procedure, it is a good idea to check it using macro
projectors.

An alternative to the preceding discussion is an approach based upon statistical
mechanics, which in its simplest form consists in choosing an appropriate basis
$\{|\Omega_j\rangle\}$ for the subspace corresponding to the initial state of the
apparatus (the space on which $Z$ projects), and assigning a probability $p_j$ to
the state $|\Omega_j\rangle$. Assuming the correctness of (17.17) and (17.18), one
can use the consistent family supported by the (enormous) collection of histories
of the form

\[
  \begin{aligned}
    &[x^+] \otimes [\Omega_j] \odot [z^+] \odot Z^+, \\
    &[x^+] \otimes [\Omega_j] \odot [z^-] \odot Z^-,
  \end{aligned}
  \tag{17.25}
\]

with $j = 1, 2, \ldots$, to obtain (17.24). Note that consistency is ensured by
$Z^+ Z^- = 0$ along with the fact that the initial states for histories ending in
$Z^+$ are mutually orthogonal, and likewise those ending in $Z^-$.
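
A small numerical sketch may make the statistical-mechanics version concrete. It
assumes a toy apparatus with only three microstates and made-up probabilities
$p_j$; the names (`P`, `evolve`, `weight`) and the encoding of apparatus states as
tuples are illustrative choices, not taken from the text. The dynamics is the
simplest one compatible with (17.17) and (17.18): the microstate label $j$ is
carried along and the pointer direction is set by $S_z$.

```python
from math import isclose

# Hypothetical probabilities p_j for three apparatus microstates Omega_j.
P = {1: 0.5, 2: 0.3, 3: 0.2}

# Amplitudes of |x+> in the S_z basis: <z+|x+> = <z-|x+> = 1/sqrt(2).
X_PLUS = {"z+": 2 ** -0.5, "z-": 2 ** -0.5}

def evolve(spin, j):
    """Final apparatus label from (17.17): pointer direction fixed by S_z."""
    pointer = "up" if spin == "z+" else "down"
    return (pointer, j)

def weight(j, spin1, pointer2):
    """p_j times the weight of [x+] (x) [Omega_j], then spin1 at t1, then Z^(pointer2)."""
    amp = X_PLUS[spin1]  # amplitude left after projecting on spin1 at t1
    final_pointer, _ = evolve(spin1, j)
    # The macro projector keeps the chain ket only if the pointer matches.
    return P[j] * abs(amp) ** 2 if final_pointer == pointer2 else 0.0

def pr_spin_given_pointer(spin1, pointer2):
    joint = sum(weight(j, spin1, pointer2) for j in P)
    marginal = sum(weight(j, s, pointer2) for j in P for s in X_PLUS)
    return joint / marginal

print(pr_spin_given_pointer("z+", "up"))  # 1.0
print(pr_spin_given_pointer("z-", "up"))  # 0.0
```

The condition here refers only to the pointer direction, not to the microstate
$j$, which is the coarse-grained conditioning that (17.24) expresses.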

Yet another approach to the same problem is to describe the measuring apparatus
at $t_0$ by means of a density matrix $\rho$ thought of as a pre-probability, as
discussed in Sec. 15.6. Since $\rho$ describes an apparatus in an initial ready
state, the probability, computed from $\rho$, that the apparatus will not be in
this state must be zero:

\[
  \mathrm{Tr}[\rho(I - Z)] = 0.
  \tag{17.26}
\]

Since both $\rho$ and $I - Z$ are positive operators, (17.26) implies, see (3.92),
that $\rho(I - Z) = 0$, or

\[
  Z\rho = \rho,
  \tag{17.27}
\]

which means that the support of $\rho$ (Sec. 3.9) falls in the subspace
$\mathcal{Z}$ on which $Z$ projects. Consequently, $\rho$ may be written in the
diagonal form

\[
  \rho = \sum_j p_j\,|\Omega_j\rangle\langle\Omega_j|,
  \tag{17.28}
\]

where $\{|\Omega_j\rangle\}$ is an orthonormal basis of $\mathcal{Z}$. To be sure,
this could be a different basis of $\mathcal{Z}$ from the one introduced earlier,
but since the vectors in either basis can be expressed as linear combinations of
vectors in the other, it follows that (17.17) and (17.18) will still be true.
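
The step from (17.26) to (17.27) rests on positivity. A short sketch of the
argument, writing $P = I - Z$ (a symbol introduced here only for brevity):

```latex
\[
  0 = \mathrm{Tr}[\rho P] = \mathrm{Tr}[P \rho P]
    = \mathrm{Tr}\bigl[ (\rho^{1/2} P)^\dagger (\rho^{1/2} P) \bigr]
  \;\Longrightarrow\; \rho^{1/2} P = 0
  \;\Longrightarrow\; \rho P = 0 ,
\]
\[
  \rho\,(I - Z) = 0
  \;\Longrightarrow\; \rho Z = \rho
  \;\Longrightarrow\; Z \rho = (\rho Z)^\dagger = \rho .
\]
```

The second equality in the first line uses $P^2 = P$ together with the cyclic
property of the trace, and the implication that follows uses the fact that
$\mathrm{Tr}[A^\dagger A] = 0$ only if $A = 0$.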

Given $\rho$ in the form (17.28), the measurement process can be analyzed using
the procedures of Sec. 15.6, including (15.48) for consistency conditions and
(15.50) for weights. Using these one can show that the two histories

\[
  [x^+] \odot
  \begin{cases}
    [z^+] \odot Z^+, \\
    [z^-] \odot Z^-
  \end{cases}
  \tag{17.29}
\]


form the support of a consistent family. This family resembles (17.23), except
that the initial state $[x^+]$ at $t_0$ contains no reference to the apparatus,
since the initial state of the apparatus is represented by a density matrix. The
weights of the histories in (17.29) are the same as for their counterparts in
(17.23), and once again lead to the conditional probabilities in (17.24).


17.5 General destructive measurements 

The preceding discussion of the measurement of $S_z$ for a spin-half particle can
be easily extended to a schematic description of an idealized measuring process
for a more complicated system $\mathcal{S}$ which interacts with a measuring
apparatus $\mathcal{M}$. The measured properties will correspond to some
orthonormal basis $\{|s^k\rangle\}$, $k = 1, 2, \ldots n$ of $\mathcal{S}$, and we
shall assume that the measurement process corresponds to a unitary time
development from $t_0$ to $t_1$ to $t_2$ of the form

\[
  |s^k\rangle \otimes |M_0\rangle \mapsto |s^k\rangle \otimes |M_1\rangle \mapsto |N^k\rangle,
  \tag{17.30}
\]

where $|M_0\rangle$ and $|M_1\rangle$ are states of the apparatus at $t_0$ and
$t_1$ before it interacts with $\mathcal{S}$, and the $\{|N^k\rangle\}$ are
orthonormal states on $\mathcal{S} \otimes \mathcal{M}$ for which a measurement
pointer indicates a definite outcome of the measurement. (The $\{|N^k\rangle\}$
are a pointer basis in the notation introduced at the end of Sec. 9.5.)

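Assume that at $t_0$ the initial state of $\mathcal{S} \otimes \mathcal{M}$ is

\[
  |\Psi_0\rangle = |s_0\rangle \otimes |M_0\rangle,
  \tag{17.31}
\]

where

\[
  |s_0\rangle = \sum_k c_k\,|s^k\rangle,
  \tag{17.32}
\]

with $\sum_k |c_k|^2 = 1$, is an arbitrary superposition of the basis states of
$\mathcal{S}$. Unitary time evolution will then result in a state

\[
  |\Psi_2\rangle = T(t_2, t_0)\,|\Psi_0\rangle = \sum_k c_k\,|N^k\rangle
  \tag{17.33}
\]

at $t_2$. Using the two-time family

\[
  \Psi_0 \odot I \odot \{N^1, N^2, \ldots\},
  \tag{17.34}
\]
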
where square brackets have been omitted from $[N^k]$, and regarding
$|\Psi_2\rangle$ as a pre-probability, one finds

\[
  \Pr(N_2^k) = |c_k|^2
  \tag{17.35}
\]

for the probability of the $k$th outcome of the measurement at the time $t_2$.
One can refine (17.34) to a consistent family with support

\[
  \Psi_0 \odot
  \begin{cases}
    [s^1] \odot N^1, \\
    [s^2] \odot N^2, \\
    \quad\cdots \\
    [s^n] \odot N^n,
  \end{cases}
  \tag{17.36}
\]


and from it derive the conditional probabilities

\[
  \Pr(s_1^j \mid N_2^k) = \delta_{jk} = \Pr(N_2^k \mid s_1^j),
  \tag{17.37}
\]

assuming $\Pr(N_2^k) > 0$. That is, given the measurement outcome $N^k$ at $t_2$,
$\mathcal{S}$ was in the state $|s^k\rangle$ before the measurement took place.
Thus the measurement interaction results in an appropriate correlation between the
later apparatus output and the earlier state of the measured system.
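
The conditional probabilities can be traced back to the chain kets of the
histories, using (17.30) and (17.32). The following sketch follows the history
$\Psi_0 \odot [s^j] \odot N^k$ step by step; the symbol $W$ for a history weight
is introduced here only for this calculation.

```latex
\[
  |\Psi_0\rangle
  \;\mapsto\; |s_0\rangle \otimes |M_1\rangle
  \;\xrightarrow{\;[s^j]\;}\; c_j\,|s^j\rangle \otimes |M_1\rangle
  \;\mapsto\; c_j\,|N^j\rangle
  \;\xrightarrow{\;N^k\;}\; \delta_{jk}\,c_j\,|N^j\rangle ,
\]
\[
  W(s^j, N^k) = \delta_{jk}\,|c_j|^2 ,
  \qquad
  \Pr(s_1^j \mid N_2^k)
    = \frac{W(s^j, N^k)}{\sum_i W(s^i, N^k)}
    = \frac{\delta_{jk}\,|c_j|^2}{|c_k|^2}
    = \delta_{jk} .
\]
```

The division is allowed only when $|c_k|^2 > 0$, which is the condition that the
outcome $N^k$ has nonzero probability.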

The preceding analysis can be generalized to a measurement of properties which
are not necessarily pure states, but form a decomposition of the identity

\[
  I_{\mathcal{S}} = \sum_k S^k
  \tag{17.38}
\]

for system $\mathcal{S}$, where some of the projectors are onto subspaces of
dimension greater than 1. This might arise if one were interested in the
measurement of a physical variable of the form

\[
  V = \sum_k v'_k\,S^k,
  \tag{17.39}
\]

see (5.24), some of whose eigenvalues are degenerate.

Let us assume that the subspace onto which $S^k$ projects is spanned by an
orthonormal collection $\{|s^{kl}\rangle,\ l = 1, 2, \ldots\}$, so that

\[
  S^k = \sum_l |s^{kl}\rangle\langle s^{kl}|.
  \tag{17.40}
\]

Assume that the counterpart of (17.30) is a unitary time development

\[
  |s^{kl}\rangle \otimes |M_0\rangle \mapsto |s^{kl}\rangle \otimes |M_1\rangle \mapsto |N^{kl}\rangle,
  \tag{17.41}
\]

where $\{|N^{kl}\rangle\}$ is an orthonormal collection of states on
$\mathcal{S} \otimes \mathcal{M}$ labeled by both $k$ and $l$, and

\[
  N^k = \sum_l |N^{kl}\rangle\langle N^{kl}|
  \tag{17.42}
\]

represents a property of $\mathcal{M}$ corresponding to the $k$th measurement
outcome.

The counterpart of (17.36) is the consistent family

\[
  \Psi_0 \odot
  \begin{cases}
    S^1 \odot N^1, \\
    S^2 \odot N^2, \\
    \quad\cdots \\
    S^n \odot N^n,
  \end{cases}
  \tag{17.43}
\]

where $|\Psi_0\rangle$ is given by (17.31), with

\[
  |s_0\rangle = \sum_{kl} c_{kl}\,|s^{kl}\rangle
  \tag{17.44}
\]

the obvious counterpart of (17.32). Corresponding to (17.37) one has

\[
  \Pr(S_1^j \mid N_2^k) = \delta_{jk} = \Pr(N_2^k \mid S_1^j),
  \tag{17.45}
\]

with the physical interpretation that a measurement outcome $N^k$ at $t_2$ implies
that $\mathcal{S}$ had the property $S^k$ at $t_1$, and vice versa.
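
As a numerical illustration of the degenerate case, the following sketch uses
made-up amplitudes $c_{kl}$ for a hypothetical system in which the property with
$k = 1$ corresponds to a two-dimensional subspace and the one with $k = 2$ to a
one-dimensional subspace. It assumes, as in (17.41), that each $|s^{kl}\rangle$
evolves into its own $|N^{kl}\rangle$, so the weight of the history with $S^j$
followed by $N^k$ is $\delta_{jk} \sum_l |c_{kl}|^2$. The function names are
illustrative only.

```python
from math import isclose, sqrt

# Hypothetical amplitudes c_kl, keyed by (k, l); their squares sum to 1.
C = {(1, 1): 0.5, (1, 2): 0.5j, (2, 1): 1 / sqrt(2)}

def weight(j, k):
    """Weight of the history Psi_0, then S^j at t1, then N^k at t2."""
    if j != k:
        return 0.0  # states in S^j evolve into N^j, which is orthogonal to N^k
    return sum(abs(c) ** 2 for (kk, _), c in C.items() if kk == k)

def pr_outcome(k):
    """Probability of outcome N^k at t2, summed over the degenerate label l."""
    return sum(weight(j, k) for j in (1, 2))

def pr_property_given_outcome(j, k):
    return weight(j, k) / pr_outcome(k)

print(pr_outcome(1), pr_outcome(2))
print(pr_property_given_outcome(1, 1), pr_property_given_outcome(2, 1))
```

Each outcome probability is the sum of $|c_{kl}|^2$ over $l$, and the conditional
probabilities reduce to the Kronecker delta in (17.45).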

The measurement schemes discussed in this section can be extended to a gen- 
uinely macroscopic description of the measuring apparatus in a straightforward 
manner using either of the approaches discussed in Sec. 17.4. 


18.1 Beam splitter and successive measurements

Sometimes a quantum system, hereafter referred to as a “particle”, is destroyed dur- 
ing a measurement process, but in other cases it continues to exist in an identifiable 
form after interacting with the measuring apparatus, with some of its properties 
unchanged or related in a nontrivial way to properties which it possessed before 
this interaction. In such a case it is interesting to ask what will happen if a second 
measurement is carried out on the particle: how will the outcome of the second 
measurement be related to the outcome of the first measurement, and to properties 
of the particle between the two measurements? 

Let us consider a specific example in which a particle (photon or neutron) passes 
through a beam splitter B and is then subjected to a measurement by nondestructive 
detectors located in the c and d output channels as shown in Fig. 18.1. Assume 
that the unitary time development of the particle in the absence of any measuring 
devices is given by 


\[
  |0a\rangle \mapsto \bigl(|1c\rangle + |1d\rangle\bigr)/\sqrt{2}
    \mapsto \bigl(|2c\rangle + |2d\rangle\bigr)/\sqrt{2} \mapsto \cdots
  \tag{18.1}
\]

as time progresses from $t_0$ to $t_1$ to $t_2 \ldots$. Here the kets denote wave
packets whose approximate locations are shown by the circles in Fig. 18.1, and the
labels are similar to those used for the toy model in Ch. 12.

Fig. 18.1. Beam splitter followed by nondestructive measuring devices. The circles
indicate the locations of wave packets corresponding to different kets.

The detectors are assumed to register the passage of the particle while having a
negligible influence on the time development of its wave packet. Toy detectors with
this property were introduced earlier, in Secs. 7.4 and 12.3. To actually construct
such a device in the laboratory is much more difficult, but, at least for some types
of particle, not out of the question. We assume that the interaction of the particle
with the detector $C$ in Fig. 18.1 leads to a unitary time development during the
interval from $t_1$ to $t_2$ of the form

\[
  \begin{aligned}
    |1c\rangle|C^\circ\rangle &\mapsto |2c\rangle|C^*\rangle, \\
    |1d\rangle|C^\circ\rangle &\mapsto |2d\rangle|C^\circ\rangle,
  \end{aligned}
  \tag{18.2}
\]


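where $|C^\circ\rangle$ denotes the “ready” or “untriggered” state of the
detector, and $|C^*\rangle$ the “triggered” state orthogonal to
$|C^\circ\rangle$. (The tensor product symbol, as in
$|1c\rangle \otimes |C^\circ\rangle$, has been omitted.) The behavior of the other
detectors $\hat{C}$ and $D$ is similar, and thus an initial state

\[
  |\Psi_0\rangle = |0a\rangle|C^\circ\rangle|\hat{C}^\circ\rangle|D^\circ\rangle
  \tag{18.3}
\]

develops unitarily to

\[
  |\Psi_1\rangle = \bigl(|1c\rangle + |1d\rangle\bigr)
    |C^\circ\rangle|\hat{C}^\circ\rangle|D^\circ\rangle/\sqrt{2},
  \tag{18.4}
\]
\[
  |\Psi_2\rangle = \bigl(|2c\rangle|C^*\rangle|\hat{C}^\circ\rangle|D^\circ\rangle
    + |2d\rangle|C^\circ\rangle|\hat{C}^\circ\rangle|D^\circ\rangle\bigr)/\sqrt{2},
  \tag{18.5}
\]
\[
  |\Psi_3\rangle = \bigl(|3c\rangle|C^*\rangle|\hat{C}^*\rangle|D^\circ\rangle
    + |3d\rangle|C^\circ\rangle|\hat{C}^\circ\rangle|D^*\rangle\bigr)/\sqrt{2}
  \tag{18.6}
\]

at the times $t_1$, $t_2$, $t_3$.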


We shall be interested in families of histories based on the initial state
$|\Psi_0\rangle$. The simplest one to understand in physical terms is a family
$\mathcal{F}$ in which at the times $t_1$, $t_2$, and $t_3$ every detector is
either ready or triggered, and the particle is represented by a wave packet in one
of the two output channels. The support of $\mathcal{F}$ consists of the two
histories

\[
  \begin{aligned}
    Y^c &= \Psi_0 \odot [1c]\,C^\circ \hat{C}^\circ D^\circ
           \odot [2c]\,C^* \hat{C}^\circ D^\circ
           \odot [3c]\,C^* \hat{C}^* D^\circ, \\
    Y^d &= \Psi_0 \odot [1d]\,C^\circ \hat{C}^\circ D^\circ
           \odot [2d]\,C^\circ \hat{C}^\circ D^\circ
           \odot [3d]\,C^\circ \hat{C}^\circ D^*,
  \end{aligned}
  \tag{18.7}
\]

where square brackets have been omitted from $[\Psi_0]$, $[C^\circ]$, etc., so
that the formula remains valid if one employs macro projectors, as in Sec. 17.4.
In $Y^c$ the particle moves out along channel $c$ and triggers the detectors $C$
and $\hat{C}$ as it passes through them, while in $Y^d$ the particle moves along
channel $d$ and triggers $D$.

The situation described by these histories is essentially the same as it would be
if a classical particle were scattered at random by the beam splitter into either
the $c$ or the $d$ channel, and then traveled out along the channel triggering the
corresponding detector(s). Thus if $C$ is triggered at time $t_2$ the particle is
surely in the $c$ channel, and will later trigger $\hat{C}$, whereas if $C$ is
still in its ready state at $t_2$, this means the particle is in the $d$ channel,
and will later trigger $D$. That these assertions are indeed correct for a quantum
particle can be seen by working out various conditional probabilities, e.g.

\[
  \Pr([1c]_1 \mid C_2^*) = 1 = \Pr([1d]_1 \mid C_2^\circ),
  \tag{18.8}
\]
\[
  \Pr([2c]_2 \mid C_2^*) = 1 = \Pr([2d]_2 \mid C_2^\circ),
  \tag{18.9}
\]
\[
  \Pr([3c]_3 \mid C_2^*) = 1 = \Pr([3d]_3 \mid C_2^\circ),
  \tag{18.10}
\]

where the subscripts indicate the times, $t_1$ or $t_2$ or $t_3$, at which the
events occur. Thus the location of the particle either before or after $t_2$ can
be inferred from whether it has or has not been detected by $C$ at $t_2$. There
are, in addition, correlations between the outcomes of the different measurements:

\[
  \Pr(\hat{C}_3^* \mid C_2^*) = 1, \qquad \Pr(D_3^* \mid C_2^*) = 0,
  \tag{18.11}
\]
\[
  \Pr(\hat{C}_3^\circ \mid C_2^\circ) = 1, \qquad \Pr(D_3^* \mid C_2^\circ) = 1.
  \tag{18.12}
\]

Thus whether $\hat{C}$ or $D$ will later detect the particle is determined by
whether it was or was not detected earlier by $C$.

The conditional probabilities in (18.8)-(18.12) are straightforward consequences
of the fact that all histories in $\mathcal{F}$ except for the two in (18.7) have
zero probability. Since these conditional probabilities, with the exception of
(18.9), involve more than two times (note that the initial $\Psi_0$ is implicit in
the condition), they cannot be obtained by using the Born rule, and are therefore
inaccessible to older approaches to quantum theory which lack the formalism of
Ch. 10. These older approaches employ a notion of “wave function collapse” in
order to get results comparable to (18.9)-(18.12), and this is the subject of the
next section.
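
The correlations between detector outcomes can be reproduced with a small
simulation of the beam splitter and detectors. This is a sketch under stated
assumptions, not the formalism of the text: basis states are encoded as tuples
`(position, C, C_hat, D)` with `"o"` for ready and `"*"` for triggered, the
function names are illustrative, and the dynamics is defined only on the states
actually reached from the initial state. The histories used here specify only
the state of detector $C$ at $t_2$ and a detector event at $t_3$, which is a
coarser description than the full family discussed above.

```python
from math import sqrt, isclose

# Basis label: (position, C, C_hat, D); "o" = ready, "*" = triggered.
PSI0 = {("0a", "o", "o", "o"): 1.0}

def step(state, t):
    """Unitary step from time t to t+1, defined only on states reached from PSI0."""
    out = {}
    for (pos, c, c_hat, d), amp in state.items():
        if t == 0:
            # Beam splitter: first arrow of (18.1).
            targets = [("1c", amp / sqrt(2)), ("1d", amp / sqrt(2))]
        else:
            channel = pos[-1]
            if channel == "c" and t == 1:
                c = "*"        # detector C fires, as in (18.2)
            elif channel == "c" and t == 2:
                c_hat = "*"    # second detector in channel c fires
            elif channel == "d" and t == 2:
                d = "*"        # detector D fires
            targets = [(f"{t + 1}{channel}", amp)]
        for new_pos, a in targets:
            key = (new_pos, c, c_hat, d)
            out[key] = out.get(key, 0.0) + a
    return out

FIELD = {"pos": 0, "C": 1, "C_hat": 2, "D": 3}

def project(state, **conditions):
    """Keep the basis states matching every condition (a diagonal projector)."""
    return {
        key: amp for key, amp in state.items()
        if all(key[FIELD[name]] == value for name, value in conditions.items())
    }

def norm2(state):
    return sum(abs(amp) ** 2 for amp in state.values())

def joint(c_at_t2, **event_at_t3):
    """Weight of: detector C in state c_at_t2 at t2, then the given event at t3."""
    ket = project(step(step(PSI0, 0), 1), C=c_at_t2)
    ket = project(step(ket, 2), **event_at_t3)
    return norm2(ket)

def pr(c_at_t2, **event_at_t3):
    return joint(c_at_t2, **event_at_t3) / joint(c_at_t2)

print(pr("*", C_hat="*"), pr("*", D="*"))  # 1.0 0.0  (C triggered at t2)
print(pr("o", C_hat="*"), pr("o", D="*"))  # 0.0 1.0  (C still ready at t2)
```

No collapse step appears anywhere in this calculation: the conditional
probabilities follow from the weights of histories alone.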


18.2 Wave function collapse

Quantum measurements have often been analyzed using the following idea, which
goes back to von Neumann. Consider an isolated system $\mathcal{S}$, and suppose
that its wave function evolves unitarily, so that it is $|s_1\rangle$ at a time
$t_1$. At this time, or very shortly thereafter, $\mathcal{S}$ interacts with a
measuring apparatus $\mathcal{M}$ designed to determine whether $\mathcal{S}$ is
in one of the states of a collection $\{|s^k\rangle\}$ forming an orthonormal
basis of the Hilbert space of $\mathcal{S}$. The measurement will have an outcome
$k$ with probability $|\langle s_1|s^k\rangle|^2$, and if the outcome is $k$ the
effect of the measurement will be to “collapse” or “reduce” $|s_1\rangle$ to
$|s^k\rangle$.

This collapse picture of a measurement proceeds in the following way when
$\mathcal{S}$ is the particle and $\mathcal{M}$ the detector $C$ in Fig. 18.1. The
particle undergoes unitary time evolution until it encounters the measuring
apparatus, and thus at $t_1$ it is in a state

\[
  |1\bar{a}\rangle = \bigl(|1c\rangle + |1d\rangle\bigr)/\sqrt{2}.
  \tag{18.13}
\]

The detector at time $t_2$ is either still in its ready state $|C^\circ\rangle$,
or else in its triggered state $|C^*\rangle$ indicating that it has detected the
particle. If the particle has been detected, its wave function will have collapsed
from its earlier delocalized state $|1\bar{a}\rangle$ into the $|2c\rangle$ wave
packet localized in the $c$ channel and moving towards detector $\hat{C}$, which
will later detect the particle. If, on the other hand, the particle has not been
detected by $C$, its wave function will have collapsed into the packet
$|2d\rangle$ localized in the $d$ channel and moving towards the $D$ detector,
which will later register the passage of the particle. Consequently $C^*$ at $t_2$
results in $\hat{C}^*$ at $t_3$, whereas $C^\circ$ at $t_2$ implies $D^*$ at
$t_3$, in agreement with the conditional probabilities in (18.11) and (18.12).

This “collapse” picture has long been regarded by many quantum physicists as 
rather unsatisfactory for a variety of reasons, among them the following. First, 
it seems somewhat arbitrary to abandon the state $|\Psi_2\rangle$ obtained by unitary time
evolution, (18.5), without providing some better reason than the fact that a mea- 
surement occurs; after all, what is special about a quantum measurement? Any 
real measurement apparatus is constructed out of aggregates of particles to which 
the laws of quantum mechanics apply, so the apparatus ought to be described by 
those laws, and not used to provide an excuse for their breakdown. Second, while 
it might seem plausible that an interaction sufficient to trigger a measuring appara- 
tus could somehow localize a particle wave packet somewhere in the vicinity of the 
apparatus, it is much harder to understand how the same apparatus by not detecting 
the particle manages to localize it in some region which is very far away. 

This second, nonlocal aspect of the collapse picture is particularly troublesome, 
and has given rise to an extensive discussion of “interaction-free measurements” in
which some property of a quantum system can be inferred from the fact that it did
not interact with a measuring device. (We shall return to this subject in Sec. 21.5.) 
Since one can imagine the gedanken experiment in Fig. 18.1 set up in outer space 
with the wave packets $|2c\rangle$ and $|2d\rangle$ an enormous distance apart, there is also the
problem that if wave function collapse takes place instantaneously it will conflict 
with the principle of special relativity according to which no influence can travel 
faster than the speed of light. 

By contrast, the analysis given in Sec. 18.1 based upon the family F, (18.7), 
shows no signs of any nonlocal effects. If C has not detected the particle at time $t_2$,
this is because the particle is moving out the d channel, not the c channel. In the 
case of a classical particle such an “interaction free measurement” of the channel in 
which it is moving gives rise to no conceptual difficulties or conflicts with relativity 
theory. As pointed out in Sec. 18.1, the family F provides a quantum description 
which resembles that of a classical particle, and thus by using this family one avoids 
the nonlocality difficulties of wave function collapse. 

Another way to avoid these difficulties is to think of wave function collapse 
not as a physical effect produced by the measuring apparatus, but as a mathemat- 
ical procedure for calculating statistical correlations of the type shown in (18.9)- 
(18.12). That is, “collapse” is something which takes place in the theorist’s note- 
book, rather than the experimentalist’s laboratory. In particular, if the wave func- 
tion is thought of as a pre-probability (Sec. 9.4), then it is perfectly reasonable to 
collapse it to a different pre-probability in the middle of a calculation. 

With reference to the arrangement in Fig. 18.1, the idea of wave function col- 
lapse corresponds fairly closely to a consistent family V with support 


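$$\Psi_0 \odot \Psi_1 \odot
\begin{cases}
[2c]\,C^*\hat{C}^\circ D^\circ \odot [3c]\,C^*\hat{C}^*D^\circ, \\
[2d]\,C^\circ\hat{C}^\circ D^\circ \odot [3d]\,C^\circ\hat{C}^\circ D^*.
\end{cases} \tag{18.14}$$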


These two histories represent unitary time evolution of the initial state, so they 
are identical up to the time $t_1$, before the particle interacts (or fails to interact)
with C, but are thereafter distinct. As a consequence of the internal consistency of
quantum reasoning, Sec. 16.3, this family gives the same results for the conditional
probabilities in (18.9)-(18.12) as does F. (Those in (18.8) are not defined in V.)
In particular, either family can be used to predict the outcomes of later $\hat{C}$ and D
measurements based upon the outcome of the earlier C measurement. 

One can imagine constructing the framework V in successive steps as follows. 
Use unitary time development up to $t_1$, but think of $|\Psi_2\rangle$ in (18.5) as a pre-probability
(rather than as representing an MQS property) useful for assigning probabilities
to the two histories

$$\Psi_0 \odot \Psi_1 \odot \{C^*, C^\circ\}, \tag{18.15}$$

which form the support of a consistent family whose projectors at $t_2$ represent
the two possible measurement outcomes. This is the minimum modification of 
a unitary family which can exhibit these outcomes. Next refine this family by 
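including the corresponding particle properties at $t_2$ along with the ready states of
the other detectors:

$$\Psi_0 \odot \Psi_1 \odot
\begin{cases}
[2c]\,C^*\hat{C}^\circ D^\circ, \\
[2d]\,C^\circ\hat{C}^\circ D^\circ.
\end{cases} \tag{18.16}$$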


Finally, use unitary extensions of these histories, Sec. 11.7, to obtain the family 
V of (18.14). In a more general situation the step from (18.15) to (18.16) can 
be more complicated: one may need to use conditional density matrices rather 
than projectors onto particle properties, as discussed in Sec. 15.7. But the general 
idea is still the same: information from the outcome of a measurement is used to 
construct a new initial state of the particle, which is then employed for calculating 
results at still later times. Wave function collapse is, in essence, an algorithm for 
constructing this new initial state given the outcome of the measurement. 

Wave function collapse is in certain respects analogous to the “collapse” of a 
classical probability distribution when it is conditioned on the basis of new infor- 
mation. Once again think of a classical particle randomly scattered by the beam 
splitter into the c or d channel. Before the particle (possibly) passes through C, 
it is delocalized in the sense that the probability is 1 /2 for it to be in either the c 
or the d channel. But when the probability for the location of the particle is con- 
ditioned on the measurement outcome it collapses in the sense that the particle is 
either in the c channel, given C*, or the d channel, given C°. This collapse of 
the classical probability distribution is obviously not a physical effect, and only in 
some metaphorical sense can it be said to be “caused” by the measurement. This 
becomes particularly clear when one notes that conditioning on the measurement 
outcome collapses the probability distribution at a time $t_1$ before the measurement
occurs in the same way that it collapses it at $t_2$ or $t_3$ after the measurement occurs.
Thinking of the collapse as being caused by the measurement would lead to an odd 
situation in which an effect precedes its cause. 
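
The classical analogy can be made concrete with a few lines of Python. This is only an illustrative sketch: the dictionary `joint`, the function name `channel_distribution`, and the assumption of an ideal detector that triggers exactly when the particle is in the c channel are introduced here for the example and are not part of the discussion above. Since a classical particle stays in its channel, the channel label applies equally at times before and after the detector.

```python
from fractions import Fraction

# Joint distribution for a classical particle: (channel, reading of detector C).
# Assumption for illustration: the detector is ideal, so it is triggered ("C*")
# exactly when the particle is in channel c and stays ready ("C0") otherwise.
joint = {
    ("c", "C*"): Fraction(1, 2),
    ("d", "C0"): Fraction(1, 2),
}

def channel_distribution(joint, outcome=None):
    """Return Pr(channel), or Pr(channel | outcome) when an outcome is given."""
    weights = {}
    for (channel, reading), p in joint.items():
        if outcome is None or reading == outcome:
            weights[channel] = weights.get(channel, Fraction(0)) + p
    total = sum(weights.values())
    return {channel: p / total for channel, p in weights.items()}

prior = channel_distribution(joint)               # "delocalized": 1/2 and 1/2
given_triggered = channel_distribution(joint, "C*")
given_ready = channel_distribution(joint, "C0")
```

Conditioning merely discards the entries of `joint` that disagree with the observed reading and renormalizes what is left. Nothing in the calculation refers to the time at which the channel is considered, which is why the "collapse" applies equally well before and after the measurement.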

Precisely the same comment applies to the collapse of a quantum wave function. 
A quantum description conditioned on a particular outcome of a measurement will 
generally provide more detail, and thus appear to be “collapsed”, in comparison 
with one constructed without this information. But since the outcome of a quantum 
measurement can also tell one something about the properties of the measured 
particle prior to the measurement process (assuming a framework in which these 
properties can be discussed) one should not think of the collapse as some sort of 
physical effect with a physical cause. To be sure, in the family (18.14) it is not 
possible to discuss the location (c or d) of the particle before the measurement,
because in this particular framework the location does not make sense. The implicit
use of this type of family for discussions of quantum measurements is probably one 
reason why wave function collapse has often been confused with a physical effect. 
The availability of other families, such as F in (18.7), helps one avoid this mistake.

In summary, when quantum mechanics is formulated in a consistent way, wave 
function collapse is not needed in order to describe the interaction between a par- 
ticle (or some other quantum system) and a measuring device. One can use a 
notion of collapse as a method of constructing a particular type of consistent fam- 
ily, as indicated in the steps leading from (18.15) to (18.16) to (18.14), or else 
as a picturesque way of thinking about correlations that in the more sober lan- 
guage of ordinary probability theory are written as conditional probabilities, as in 
(18.9)-(18.12). However, for neither of these purposes is it actually essential; any
result that can be obtained by collapsing a wave function can also be obtained in a 
straightforward way by adopting an appropriate family of histories. The approach 
using histories is more flexible, and allows one to describe the measurement pro- 
cess in a natural way as one in which the properties of the particle before as well 
as after the measurement are correlated to the measurement outcomes. 

While its picturesque language may have some use for pedagogical purposes or 
for constructing mnemonics, the concept of wave function collapse has given rise 
to so much confusion and misunderstanding that it would, in my opinion, be better 
to abandon it altogether, and instead use conditional states, such as the conditional 
density matrices discussed in Sec. 15.7 and in Sec. 18.5, and conditional probabili- 
ties. These are quite adequate for constructing quantum descriptions, and are much 
less confusing. 


18.3 Nondestructive Stern-Gerlach measurements 

The Stern-Gerlach apparatus for measuring one component of spin angular mo- 
mentum of a spin-half atom was described in Ch. 17. Here we shall consider a 
modified version which, although it would be extremely difficult to construct in 
the laboratory, does not violate any principles of quantum mechanics, and is useful 
for understanding why quantum measurements that are nondestructive for certain 
properties will be destructive for other properties. Figure 18.2 shows the modified 
apparatus, which consists of several parts. First, a magnet with an appropriate field 
gradient like the one in Fig. 17.1 separates the incoming beam into two diverging 
beams depending upon the value of $S_z$, with the $S_z = +1/2$ beam going upwards
and the $S_z = -1/2$ beam going downwards. There are then two additional magnets,
with field gradients in a direction opposite to the gradient in the first magnet,
to bend the separated beams in such a way that they are traveling parallel to each 
other. These beams pass through detectors $D_a$ and $D_b$ of the nondestructive sort
employed in Fig. 18.1. We assume not only that these detectors produce a negligible
perturbation of the spatial wave packets in each beam, but also that they do
not perturb the z component of spin. (A detector in one beam and not the other 
would actually be sufficient, but using two emphasizes the symmetry of the situa- 
tion.) The detectors are followed by a series of magnets which reverse the process 
produced by the first set of magnets and bring the two beams back together again. 


Fig. 18.2. Modified Stern-Gerlach apparatus for nondestructive measurements of $S_z$.

The net result is that an atom with either $S_z = +1/2$ or $S_z = -1/2$ will traverse
the apparatus and emerge in the same beam at the other end. The only difference
is in the detector which is triggered while the atom is inside the apparatus. The
unitary time evolution corresponding to the measurement process is

$$|z^+\rangle|Z^\circ\rangle \mapsto |z^+\rangle|Z^+\rangle, \quad |z^-\rangle|Z^\circ\rangle \mapsto |z^-\rangle|Z^-\rangle, \tag{18.17}$$

where $|z^\pm\rangle$ are spin states corresponding to $S_z = \pm 1/2$, $|Z^\circ\rangle$ is the initial state of
the apparatus, and $|Z^+\rangle$ and $|Z^-\rangle$ are mutually orthogonal apparatus states corresponding
to detection by the upper or by the lower detector in Fig. 18.2. One could
equally well use macro projectors for the apparatus states, as in Sec. 17.4, and for
this reason we will employ $Z^\circ$ and $Z^\pm$ without square brackets as symbols for the
corresponding projectors. In addition, the coordinate representing the center of
mass of the atom is not shown in (18.17); omitting it will cause no confusion, and
including it would merely clutter the notation. We shall assume that there are no
magnetic fields outside the apparatus which could affect the atom’s spin, and that
the apparatus states $|Z^\circ\rangle$ and $|Z^\pm\rangle$ do not change with time except when interacting
with the atom, (18.17). The latter assumption is convenient, but not essential.

It is obvious that the same type of apparatus can be used to measure other com- 
ponents of spin by using a different direction for the magnetic field gradient. For 
example, if the atom is thought of as moving along the y axis, then by simply rotating
the apparatus about this axis it can be used to measure $S_w$ for w any direction
in the x, z plane. Alternatively, one could arrange for the atom to pass through
regions of uniform magnetic field before entering and after leaving the apparatus
sketched in Fig. 18.2, in order to cause a precession of an atom with $S_w = \pm 1/2$
into one with $S_z = \pm 1/2$, and then back again after the measurement is over.

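We will consider various histories based upon an initial state

$$|\Psi_0\rangle = |u^+\rangle|Z^\circ\rangle, \tag{18.18}$$

at the time $t_0$, where the kets

$$\begin{aligned}
|u^+\rangle &= +\cos(\vartheta/2)\,|z^+\rangle + \sin(\vartheta/2)\,|z^-\rangle, \\
|u^-\rangle &= -\sin(\vartheta/2)\,|z^+\rangle + \cos(\vartheta/2)\,|z^-\rangle,
\end{aligned} \tag{18.19}$$

see (4.14), correspond to $S_u = +1/2$ and $-1/2$ for a direction u in the x, z plane
at an angle $\vartheta$ to the +z axis, so that $S_u$ is equal to $S_z$ when $\vartheta = 0$, and $S_x$ when
$\vartheta = \pi/2$.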

Consider the consistent family with support

$$\Psi_0 \odot
\begin{cases}
[z^+] \odot Z^+ \odot [z^+], \\
[z^-] \odot Z^- \odot [z^-],
\end{cases} \tag{18.20}$$

where the projectors refer to an initial time $t_0$, a time $t_1$ before the atom enters the
apparatus, a time $t_2$ when it has left the apparatus, and a still later time $t_3$. The
conditional probabilities

$$\Pr([z^+]_1 \mid Z^+_2) = 1 = \Pr([z^-]_1 \mid Z^-_2) \tag{18.21}$$


show that the properties $S_z = \pm 1/2$ before the measurement are correlated with the
measurement outcomes, so that the apparatus does indeed carry out a measurement.
In addition, the probabilities

$$\begin{aligned}
\Pr([z^+]_3 \mid [z^+]_1) &= 1 = \Pr([z^-]_3 \mid [z^-]_1), \\
\Pr([z^-]_3 \mid [z^+]_1) &= 0 = \Pr([z^+]_3 \mid [z^-]_1)
\end{aligned} \tag{18.22}$$

show that the measurement process carried out by this apparatus is nondestructive
for the properties $[z^+]$ and $[z^-]$: they have the same values after the measurement
as before.

Next consider a different family whose support consists of the four histories

$$\Psi_0 \odot [u^+] \odot
\begin{cases}
Z^+ \odot \{[u^+], [u^-]\}, \\
Z^- \odot \{[u^+], [u^-]\}.
\end{cases} \tag{18.23}$$

Despite the fact that the four final projectors at $t_3$ are not all orthogonal to one
another, the orthogonality of $Z^+$ and $Z^-$ ensures consistency. It is straightforward
to work out the weights associated with the different histories in (18.23) using the
method of chain kets, Sec. 11.6. One result is

$$\begin{aligned}
\Pr([u^+]_3 \mid [u^+]_1) &= |\langle u^+|z^+\rangle\langle z^+|u^+\rangle|^2 + |\langle u^+|z^-\rangle\langle z^-|u^+\rangle|^2 \\
&= (\cos(\vartheta/2))^4 + (\sin(\vartheta/2))^4 = (1 + \cos^2\vartheta)/2.
\end{aligned} \tag{18.24}$$

Except for $\vartheta = 0$ or $\pi$, the probability of $[u^+]$ at $t_3$ is less than 1, meaning that
the property $S_u = +1/2$ has a certain probability of being altered when the atom
interacts with the apparatus designed to measure $S_z$. The disturbance is a maximum
for $\vartheta = \pi/2$, which corresponds to $S_u = S_x$: indeed, the value of $S_x$ after the atom
has passed through the device is completely random, independent of its earlier
value. 
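
The result (18.24) is easy to check numerically. The following Python sketch follows the chain-ket idea in a stripped-down form; the function names and the representation of a spin ket as a pair of real amplitudes in the (z+, z-) basis are choices made here for illustration. It assumes, as in the text, that nothing acts on the spin outside the apparatus, so the only nontrivial step is the interaction (18.17).

```python
import math

def u_plus(theta):
    """|u+> of (18.19) as (z+ amplitude, z- amplitude)."""
    return (math.cos(theta / 2), math.sin(theta / 2))

def u_minus(theta):
    """|u-> of (18.19) as (z+ amplitude, z- amplitude)."""
    return (-math.sin(theta / 2), math.cos(theta / 2))

def dot(a, b):
    """Inner product of two real two-component spin vectors."""
    return a[0] * b[0] + a[1] * b[1]

def history_weights(theta):
    """Weights of the four histories in (18.23).

    The projector [u+] at t1 leaves |u+> unchanged; (18.17) then attaches the
    z+ component to the pointer state Z+ and the z- component to Z-.
    """
    a, b = u_plus(theta)
    branches = {"Z+": (a, 0.0), "Z-": (0.0, b)}
    weights = {}
    for pointer, spin in branches.items():
        for label, ket in (("u+", u_plus(theta)), ("u-", u_minus(theta))):
            # Squared norm of the chain ket after projecting the spin on ket.
            weights[(pointer, label)] = dot(ket, spin) ** 2
    return weights

def prob_u_plus_unchanged(theta):
    """Pr([u+] at t3 | [u+] at t1); [u+] at t1 has probability 1 here."""
    w = history_weights(theta)
    return w[("Z+", "u+")] + w[("Z-", "u+")]
```

Evaluating `prob_u_plus_unchanged` over a range of angles reproduces $(1 + \cos^2\vartheta)/2$, with the minimum value 1/2 at $\vartheta = \pi/2$.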


18.4 Measurements and incompatible families 

As noted in Sec. 16.4, the relationship of incompatibility between quantum frame- 
works does not have a good classical analog, and thus it has to be understood in 
quantum mechanical terms and illustrated through quantum examples. Quantum 
measurements can provide useful examples, and in this section we consider two: 
one uses a beam splitter as in Sec. 18.1, the other employs nondestructive Stern-Gerlach
devices of the type described in Sec. 18.3.

Think of a beam splitter, Fig. 18.3(a), similar to that in Fig. 18.1 except that 
there are no measuring devices in the output channels c and d. There is a consistent 
family whose support consists of the pair of histories 

$$[0a] \odot \{[1c], [1d]\} \tag{18.25}$$

at the times $t_0$ and $t_1$, where the notation is the same as in Sec. 18.1. The unitary
time development in (18.1) implies that each history has a probability of 1/2.

The closed box surrounding the apparatus in Fig. 18.3(a) means that we are 
thinking of it as an isolated quantum system. Because it is isolated, there is no di- 
rect way to check the probabilities associated with the family in (18.25). However, 
there is a strategy which can provide indirect evidence. Suppose that at some time 
later than $t_1$ and just before the particle would collide with one of the walls of the
box, two holes are opened, as shown in Fig. 18.3(b), allowing the particle to escape 
and be detected by one of the two detectors C and D. If the particle is detected by 
C, it seems plausible that it was earlier traveling outwards through the c and not the 
d channel; similarly, detection by D indicates that it was earlier in the d channel. 
Data obtained by repeating the experiment a large number of times can be used to 
check that each history in (18.25) has a probability of 1/2.

Fig. 18.3. Beam splitter inside closed box (a), with two possibilities (b) and (c) for a
measurement if the particle is allowed to emerge through holes in the sides of the box.

Could it be that opening the box along with the subsequent measurements perturbs
the particle in such a way as to invalidate the preceding analysis? This is
a perfectly legitimate question, one which could also come up when one opens a
“classical” box in order to determine what is going on inside it: think of a box 
containing unexposed photographic film, or a compressed gas. While there is no 
way of addressing the classical box-opening problem in a manner fully accept- 
able to sceptical philosophers, physicists will be content if they are able to achieve 
some reasonable understanding of what is likely to be going on during the open- 
ing process. This may require auxiliary experiments, mathematical modeling, and 
a certain amount of theoretical reasoning. On the basis of these physicists might 
be reasonably confident when inferring something about the state of affairs in- 
side a box before it is opened, using information from observations carried out 
afterwards.

Given the internal consistency of quantum reasoning, and the fact that quantum
principles have been verified time and time again in innumerable experiments, it is 
not unreasonable to use quantum theory itself in order to examine what will happen 
if holes are opened in the box in Fig. 18.3, and whether the detection of the particle 
by C is a good reason to suppose that it was in the c channel at $t_1$. Carrying out such
an analysis is not difficult if one assumes, as is plausible, that a timely opening of 
the holes has no effect upon the unitary time evolution of the particle’s wave packet 
other than to allow it to propagate as it would have in the complete absence of any 
walls. The rest of the analysis is the same as in Sec. 18.1, and shows that the 
conditional probabilities (18.8) also apply to the present situation: if the particle is 
later detected by C, it was in the c channel inside the box at $t_1$.

An alternative consistent family has for its support the single unitary history 

$$[0a] \odot [1\bar{a}], \tag{18.26}$$

where $|1\bar{a}\rangle$ is the superposition state defined in (18.13). This family is clearly
incompatible with the one in (18.25) because $[1\bar{a}]$ does not commute with either
[1c] or [1d]. Nonetheless, (18.26) is just as good a quantum description of the
particle moving inside the closed box as is the pair of possibilities in (18.25). An 
experimental test which will confirm that the history (18.26) does, indeed, occur 
is shown in Fig. 18.3(c), and is only slightly more complicated than the one used 
earlier. Once again, holes are opened in the walls of the box just before the arrival 
of the particle, but now there are mirrors outside the holes and a second beam 
splitter, so one has a Mach-Zehnder interferometer. Let the path lengths be such 
that a particle in the state $|1\bar{a}\rangle$ at time $t_1$ will emerge from the second beam splitter
in the f channel and trigger the detector F, whereas a particle in the orthogonal
state

$$|1\bar{b}\rangle = (-|1c\rangle + |1d\rangle)/\sqrt{2} \tag{18.27}$$

will emerge in channel e and trigger E. The experiment needs to be repeated many
times in order to get a statistically significant result, and if in every, or almost every,
run the particle is detected in F rather than E, one can infer that it was in the state
$[1\bar{a}]$ at the earlier time $t_1$. That this is a plausible inference follows once again from
the fact that quantum mechanics is a consistent theory abundantly confirmed by a 
variety of experimental tests. 
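
The role of the second beam splitter can be illustrated with a small Python sketch. The particular unitary used below is an assumption made for the example (the path lengths and phase conventions of the actual interferometer are not specified above beyond the requirement that the superposition state of (18.13) emerges in the f channel); the names `second_beam_splitter` and `detector_probabilities` are likewise hypothetical.

```python
import math

INV_SQRT2 = 1 / math.sqrt(2)

def second_beam_splitter(amp_c, amp_d):
    """Map amplitudes in channels (c, d) to amplitudes in channels (e, f).

    Assumed convention, chosen for illustration only:
    |c> -> (-|e> + |f>)/sqrt(2) and |d> -> (|e> + |f>)/sqrt(2),
    which is a unitary map.
    """
    amp_e = INV_SQRT2 * (-amp_c + amp_d)
    amp_f = INV_SQRT2 * (amp_c + amp_d)
    return amp_e, amp_f

def detector_probabilities(amp_c, amp_d):
    """Born probabilities for triggering detector E or detector F."""
    amp_e, amp_f = second_beam_splitter(amp_c, amp_d)
    return {"E": abs(amp_e) ** 2, "F": abs(amp_f) ** 2}

a_bar = (INV_SQRT2, INV_SQRT2)    # superposition state of (18.13)
b_bar = (-INV_SQRT2, INV_SQRT2)   # orthogonal superposition, (18.27)
only_c = (1.0, 0.0)               # wave packet entirely in the c channel

probs_a_bar = detector_probabilities(*a_bar)
probs_b_bar = detector_probabilities(*b_bar)
probs_only_c = detector_probabilities(*only_c)
```

With this convention the two superposition states trigger F and E respectively with certainty, whereas a packet located in the c channel alone triggers either detector with probability 1/2, so the arrangement in Fig. 18.3(c) distinguishes the superpositions but says nothing about which channel the particle was in.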

It is obviously impossible to carry out the two types of measurements indicated 
in (b) and (c) of Fig. 18.3 on the same system during the same experimental run, 
and this is not surprising given the fact that while both (18.25) and (18.26) are valid 
quantum descriptions, they are mutually incompatible, so they cannot be applied 
to the same system at the same time. The “classical” macroscopic incompatibility
of the two experimental arrangements, in the sense that setting up one of them prevents
setting up the other, mirrors the quantum incompatibility of the microscopic
events which are measured in the two cases. Thus an analysis using measurements 
can assist one in gaining an intuitive understanding of the incompatibility of quan- 
tum events and frameworks. 

It has sometimes been suggested that certain conceptual difficulties associated 
with incompatible quantum frameworks could be resolved if there were a law of 
nature which specified the framework which had to be employed in any particular 
circumstance. That such an idea is not likely to work can be seen from the fact 
that either of the experiments indicated in Fig. 18.3 could in principle be carried 
out a large distance away and thus a long time after the particle emerges from 
the box, long enough to allow a choice to be made between the two experimental 
arrangements (see the discussion of delayed choice in Ch. 20). Thus were there 
such a law of nature, it would need to either determine the choice of the later 
experiment, or allow that later choice to influence the particle while it was still 
inside the box. Neither of these seems very satisfactory. 

A second example in which measurements are useful for understanding quantum 
incompatibility is shown in Fig. 18.4(a), in which a spin-half atom moving parallel 
to the y axis passes successively through two nondestructive Stern-Gerlach de- 
vices, represented schematically by squares, of the form shown in Fig. 18.2. At the 
times $t_0$, $t_1$, and $t_2$ the atom is (approximately) at the positions indicated by the dots
in the figure. The first device measures $S_z$, and its unitary time development during
the interval from $t_0$ to $t_1$ is given by (18.17). The second device measures $S_x$, and
its unitary time development from $t_1$ to $t_2$ is given by

$$|x^+\rangle|X^\circ\rangle \mapsto |x^+\rangle|X^+\rangle, \quad |x^-\rangle|X^\circ\rangle \mapsto |x^-\rangle|X^-\rangle, \tag{18.28}$$

where $|X^\circ\rangle$, $|X^+\rangle$ and $|X^-\rangle$ are the initial state of the device and the states representing
possible outcomes of the measurement.


Fig. 18.4. Spin-half atom passing through successive nondestructive Stern-Gerlach
devices: panels (a) and (b).

Given the starting state

$$|\Psi_0\rangle = |u^+\rangle|Z^\circ\rangle|X^\circ\rangle \tag{18.29}$$

at $t_0$, and that at $t_2$ the detectors are in the states $Z^+$ and $X^+$, what can one say about
the spin of the atom at the time $t_1$ when it is midway between the two devices? A
relatively coarse family whose support is the four histories

$$\Psi_0 \odot I \odot \{Z^+X^+,\; Z^+X^-,\; Z^-X^+,\; Z^-X^-\} \tag{18.30}$$

is useful for representing the initial data (see Sec. 16.1) of $\Psi_0$ at $t_0$ and $Z^+X^+$ at $t_2$.

The consistent family (18.30) can be refined in various ways. One possibility is 
to include information about $S_z$ at $t_1$:

$$\Psi_0 \odot
\begin{cases}
[z^+] \odot \{Z^+X^+,\; Z^+X^-\}, \\
[z^-] \odot \{Z^-X^+,\; Z^-X^-\}.
\end{cases} \tag{18.31}$$

Using this family one sees that

$$\Pr([z^+]_1 \mid Z^+_2 X^+_2) = 1, \tag{18.32}$$

so that the initial data imply that $S_z = +1/2$ at $t_1$. A different refinement includes
information about $S_x$ at $t_1$:


$$\Psi_0 \odot
\begin{cases}
[x^+] \odot \{Z^+X^+,\; Z^-X^+\}, \\
[x^-] \odot \{Z^+X^-,\; Z^-X^-\}.
\end{cases} \tag{18.33}$$

Using it one finds that

$$\Pr([x^+]_1 \mid Z^+_2 X^+_2) = 1, \tag{18.34}$$

so that in this framework the initial data imply that $S_x = +1/2$ at $t_1$. Since $[z^+]$ and
$[x^+]$ do not commute, the frameworks (18.31) and (18.33) are incompatible, and
the results (18.32) and (18.34) cannot be combined, even though each is correct in 
its own framework. 
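
Both conditional probabilities can be checked with a short chain-ket calculation, sketched here in Python. The bookkeeping is an illustrative choice, not something prescribed above: the joint state is stored as a dictionary from pointer readings (Z device, X device) to the attached spin vector in the (z+, z-) basis, and the initial spin is taken to be a generic real state with angle `theta`, an assumption that does not affect the result as long as the outcome $Z^+X^+$ has nonzero probability.

```python
import math

R = 1 / math.sqrt(2)
# Spin kets as (z+ amplitude, z- amplitude); all amplitudes are real here.
KETS = {
    "z": {"+": (1.0, 0.0), "-": (0.0, 1.0)},
    "x": {"+": (R, R), "-": (R, -R)},
}

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def measure(state, axis, slot):
    """Nondestructive measurement of S_axis, as in (18.17) and (18.28).

    Each spin component along a basis ket is tagged with the matching
    reading in pointer position `slot` (0 for Z, 1 for X).
    """
    out = {}
    for pointers, spin in state.items():
        for sign, ket in KETS[axis].items():
            amp = dot(ket, spin)
            key = pointers[:slot] + (sign,) + pointers[slot + 1:]
            old = out.get(key, (0.0, 0.0))
            out[key] = (old[0] + amp * ket[0], old[1] + amp * ket[1])
    return out

def project_spin(state, ket):
    """Apply the spin projector [ket] to every branch of the state."""
    return {p: (dot(ket, s) * ket[0], dot(ket, s) * ket[1])
            for p, s in state.items()}

def prob_plus_at_t1(axis, theta=1.0):
    """Pr([axis+] at t1 | Z+ X+ at t2) in the family with S_axis at t1."""
    spin0 = (math.cos(theta / 2), math.sin(theta / 2))  # assumed initial spin
    at_t1 = measure({("0", "0"): spin0}, "z", 0)        # after the Z device
    weights = {}
    for sign, ket in KETS[axis].items():
        chain = measure(project_spin(at_t1, ket), "x", 1)
        final = chain.get(("+", "+"), (0.0, 0.0))       # project on Z+ X+
        weights[sign] = dot(final, final)               # weight of the history
    return weights["+"] / (weights["+"] + weights["-"])
```

Calling `prob_plus_at_t1("z")` and `prob_plus_at_t1("x")` corresponds to the two refinements; each call works entirely within one framework, and the code offers no way of asking about $S_z$ and $S_x$ at $t_1$ in a single history.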

There is, of course, no experimental arrangement by means of which either 
(18.32) or (18.34) can be checked directly at the precise time $t_1$. The closest one
can come is to insert another device W, as shown in Fig. 18.4(b), which carries
out a nondestructive Stern-Gerlach measurement of $S_w$ for some direction w at a
time shortly after $t_1$. First consider the case w = z, so that the W apparatus repeats
the measurement of the initial Z apparatus. One can show (the reader can easily
work out the details) that with w = z, the Z and W devices have identical outcomes:
$Z^+W^+$ or $Z^-W^-$. Thus if at $t_2$, when the atom has passed through all three
devices, Z is in the state $Z^+$, W will be in the state $W^+$. This is precisely what one
would have anticipated on the basis of (18.32): the property $S_z = +1/2$ at $t_1$ was
confirmed by the W measurement a short time later. In this sense the W device
with w = z confirms the correctness of a conclusion reached on the basis of the
consistent family in (18.31). On the other hand, if w = x, so that the W apparatus
measures $S_x$, a similar analysis shows that the X and W devices must have identical
outcomes. In particular, if at $t_2$ X is in the state $X^+$, W will be in the state $W^+$,
and this confirms the correctness of (18.34). Since the device W must have its field
gradient (the gradient in the first magnet in Fig. 18.2) in a particular direction, it is
obvious that in a particular experimental run w is either in the x or in the z direction,
and cannot be in both directions simultaneously. The situation is thus similar
to what we found in the previous example: a classical macroscopic incompatibility 
of the two measurement possibilities reflects the quantum incompatibility of the 
two frameworks (18.31) and (18.33). 

How can we know that at time $t_1$ the atom had the property revealed a bit later by
the spin measurement carried out by W? The answer to this question is the same as
for its analog in the previous example. Quantum theory itself provides a consistent 
description of the situation, including the relevant connection between a property 
of the atom before a measurement takes place and the outcome of the measurement. 
One must, of course, employ an appropriate framework for this connection to be 
evident. For example, in the case w = x one should use a consistent family with
$[x^+]$ and $[x^-]$ at time $t_1$, for a family with $[z^+]$ and $[z^-]$ at $t_1$ cannot, obviously, be
used to discuss the value of $S_x$.

There is, however, another concern which did not arise in the previous example 
using the beam splitter. The device W in Fig. 18.4(b) is located where it might 
conceivably disturb the later $S_x$ measurement carried out by X. Can we say that
the outcome of the latter, $X^+$ or $X^-$, is the same as it would have been, for this
particular experimental run, had the apparatus W been absent, as in Fig. 18.4(a)? 
This is a counterfactual question: given a situation in which W is in fact present, it 
asks what would have happened if, contrary to fact, W had been absent. Answering 
counterfactual questions requires a further development, found in Sec. 19.4, of the 
principles of quantum reasoning discussed in Ch. 16. By using it one can argue 
that both for the case w = x and also for the case w = z, had W been absent the
X measurement outcome would have been the same. 


18.5 General nondestructive measurements 

In Sec. 17.5 we discussed a fairly general scheme for measurements, in general de- 
structive, of the properties of a quantum system S corresponding to an orthonormal 
basis $\{|s^k\rangle\}$, by a measuring apparatus M initially in the state $|M_0\rangle$. To construct
a corresponding description of nondestructive measurements, suppose that the unitary
time development from $t_0$ to $t_1$ to $t_2$ corresponding to (17.30) is of the form

$$|s^k\rangle \otimes |M_0\rangle \mapsto |s^k\rangle \otimes |M_1\rangle \mapsto |s^k\rangle \otimes |M^k\rangle, \tag{18.35}$$

where the interaction between S and M takes place during the time interval from
$t_1$ to $t_2$, and $\{|M^k\rangle\}$ is an orthonormal collection of states of M corresponding to
the different measurement outcomes.

Given some initial state $|s_0\rangle$ which is a linear combination of the $\{|s^k\rangle\}$, (17.32),
one can set up a consistent family analogous to (17.36) with support

$$\Psi_0 \odot
\begin{cases}
[s^1] \odot [s^1] \otimes M^1, \\
[s^2] \odot [s^2] \otimes M^2, \\
\quad\cdots \\
[s^n] \odot [s^n] \otimes M^n,
\end{cases} \tag{18.36}$$

where $|\Psi_0\rangle$ is the state $|s_0\rangle|M_0\rangle$. Using this family one can show that $M^k$ at $t_2$
implies $s^k$ at $t_1$ ((17.37) is valid with $N^k$ replaced by $M^k$) and, in addition,

$$\Pr(s^j_2 \mid M^k_2) = \delta_{jk} = \Pr(s^j_2 \mid s^k_1). \tag{18.37}$$

The first equality tells us that if at $t_2$ the measurement outcome is $M^k$, the system
S at this time is in the state $|s^k\rangle$, whereas the second shows that this measurement
is nondestructive for the properties $\{[s^k]\}$.

Provided S and M do not interact for $t > t_2$, the later time development of
S (e.g., what will happen if it interacts with a second measuring apparatus $M'$)
can be discussed using the method of conditional density matrices described in
Sec. 15.7, with appropriate changes in notation: $t_0$, A, and B of Sec. 15.7 become
$t_2$, S, and M. Given a measurement outcome $M^k$, the corresponding conditional
density matrix, see (15.61), is

$$\rho^k = [s^k], \tag{18.38}$$

and this can be used (typically as a pre-probability) as an initial state for the further
time development of system S. (If there is a second measuring apparatus $M'$, one
must, of course, also specify its initial state.)

One can also formulate a nondestructive counterpart to the measurement of a 
general decomposition of the identity $I_S = \sum_k S^k$, (17.38), discussed in Sec. 17.5.
Let the orthonormal basis $\{|s^{kl}\rangle\}$ be chosen so that $S^k = \sum_l |s^{kl}\rangle\langle s^{kl}|$, (17.40), and
assume a unitary time development

$$|s^{kl}\rangle \otimes |M_0\rangle \mapsto |s^{kl}\rangle \otimes |M_1\rangle \mapsto |s^{kl}\rangle \otimes |M^k\rangle, \tag{18.39}$$

where the apparatus state $|M^k\rangle$ corresponding to the kth outcome is assumed not
to depend upon l. The counterpart of (17.43) is a consistent family with support


$$\Psi_0 \odot
\begin{cases}
S^1 \odot S^1 \otimes M^1, \\
S^2 \odot S^2 \otimes M^2, \\
\quad\cdots \\
S^n \odot S^n \otimes M^n,
\end{cases} \tag{18.40}$$

and it yields conditional probabilities

$$\Pr(S^j_2 \mid M^k_2) = \delta_{jk} = \Pr(S^j_2 \mid S^k_1) \tag{18.41}$$

that are the obvious counterpart of (18.37). In addition, the outcome $M^k$ at $t_2$
implies the property $S^k$ at $t_1$: (17.45) holds with $N^k$ replaced by $M^k$.

It is possible to refine (18.40) to give a more precise description at $t_2$. Define

$$|\sigma^k\rangle := S^k|s_0\rangle = \sum_l c_{kl}\,|s^{kl}\rangle, \tag{18.42}$$

using the expression (17.44) for $|s_0\rangle$. Then the unitary time development in (18.39)
implies that

$$T(t_2, t_0)\bigl(|s_0\rangle \otimes |M_0\rangle\bigr) = \sum_k |\sigma^k\rangle \otimes |M^k\rangle. \tag{18.43}$$

As a consequence, the histories $\Psi_0 \odot S^k \odot (I - [\sigma^k]) \otimes M^k$ have zero weight, and

$$\Psi_0 \odot
\begin{cases}
S^1 \odot [\sigma^1] \otimes M^1, \\
S^2 \odot [\sigma^2] \otimes M^2, \\
\quad\cdots \\
S^n \odot [\sigma^n] \otimes M^n
\end{cases} \tag{18.44}$$

is again the support of a consistent family. Indeed, one can produce an even finer
family by replacing each $S^k$ at $t_1$ with the corresponding $[\sigma^k]$.

In order to describe the later time development of S, assuming no further interaction
with M for $t > t_2$, one can again employ the method of conditional density
matrices of Sec. 15.7, with

$$\rho^k = [\sigma^k] \tag{18.45}$$

at time $t_2$ corresponding to the measurement outcome $M^k$. If S is described by a
density matrix $\rho_0$ at $t_0$, the corresponding result

$$\rho^k = S^k \rho_0 S^k / \mathrm{Tr}(S^k \rho_0 S^k) \tag{18.46}$$

is known as the Lüders rule. Note that the validity of both (18.45) and (18.46)
depends on some fairly specific assumptions. If, for example, one were to suppose
that 

$$|s^{kl}\rangle \otimes |M_0\rangle \mapsto |s^{kl}\rangle \otimes |M_1\rangle \mapsto |s^{kl}\rangle \otimes |M^{kl}\rangle, \tag{18.47}$$

with the $\{|M^{kl}\rangle\}$ for different k and l an orthonormal collection, and define

$$M^k = \sum_l |M^{kl}\rangle\langle M^{kl}| \tag{18.48}$$

as the projector corresponding to the kth measurement outcome, (18.41) would still
be valid, but neither (18.45) nor (18.46) would (in general) be correct. 
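
As a concrete illustration of (18.46), the Lüders rule can be written in a few lines of Python. This is a sketch under simplifying assumptions made here for the example: a three-dimensional system with real amplitudes (so that no complex conjugation is needed), exact rational arithmetic, and a decomposition of the identity into a two-dimensional projector `S1` and a one-dimensional projector `S2`. The helper names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def luders(rho0, projector):
    """Conditional density matrix S rho0 S / Tr(S rho0 S), as in (18.46)."""
    srs = matmul(matmul(projector, rho0), projector)
    norm = trace(srs)
    if norm == 0:
        raise ValueError("this outcome has zero probability")
    return [[entry / norm for entry in row] for row in srs]

# Assumed example: normalized real state |s0> = (1/3, 2/3, 2/3).
s0 = [Fraction(1, 3), Fraction(2, 3), Fraction(2, 3)]
rho0 = [[x * y for y in s0] for x in s0]    # rho0 = |s0><s0| (real amplitudes)

S1 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]      # projector on the first two basis kets
S2 = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]      # projector on the third basis ket

rho_1 = luders(rho0, S1)
rho_2 = luders(rho0, S2)
prob_1 = trace(matmul(matmul(S1, rho0), S1))  # probability of outcome 1
```

For a pure initial state the result `rho_1` is itself a projector, namely the one onto the normalized version of $S^1|s_0\rangle$, in agreement with (18.45); the coherence between the first two basis kets is retained precisely because the apparatus state is assumed not to depend on l.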

The results in this section, like those in Sec. 17.5, can be generalized to the case 
of a macroscopic measuring apparatus using the approaches discussed in Sec. 17.4.


19.1 Quantum paradoxes

The next few chapters are devoted to resolving a number of quantum paradoxes in 
the sense of giving a reasonable explanation of a seemingly paradoxical result in 
terms of the principles of quantum theory discussed earlier in this book. None of 
these paradoxes indicates a defect in quantum theory. Instead, when they have been 
properly understood, they show us that the quantum world is rather different from 
the world of our everyday experience and of classical physics, in a way somewhat 
analogous to that in which relativity theory has shown us that the laws appropriate 
for describing the behavior of objects moving at high speed differ in significant 
ways from those of pre-relativistic physics. 

An inadequate theory of quantum measurements is at the root of several quantum 
paradoxes. In particular, the notion that wave function collapse is a physical effect 
produced by a measurement, rather than a method of calculation, see Sec. 18.2, has 
given rise to a certain amount of confusion. Smuggling rules for classical reasoning 
into the quantum domain where they do not belong and where they give rise to 
logical inconsistencies is another common source of confusion. In particular, many 
paradoxes involve mixing the results from incompatible quantum frameworks. 

Certain quantum paradoxes have given rise to the idea that the quantum world 
is permeated by mysterious influences that propagate faster than the speed of light, 
in conflict with the theory of relativity. They are mysterious in that they cannot be 
used to transmit signals, which means that they are, at least in any direct sense, ex- 
perimentally unobservable. While relativistic quantum theory is outside the scope 
of this book, an analysis of nonrelativistic versions of some of the paradoxes which 
are supposed to show the presence of superluminal influences indicates that the 
real source of such ghostly effects is the need to correct logical errors arising from 
the assumption that the quantum world is behaving in some respects in a classical 
way. When the situation is studied using consistent quantum principles, the ghosts
disappear, and with them the corresponding difficulty in reconciling quantum
mechanics with relativity theory. The reason why ghostly influences cannot be used
to transmit signals faster than the speed of light is then obvious: there are no such 
influences. 

Some quantum paradoxes are stated in a way that involves a free choice on the 
part of a human observer: e.g., whether to measure the x or the z component of spin 
angular momentum of some particle. Since the principles of quantum theory as 
treated in this book apply to a closed system, with all parts of it subject to quantum 
laws, a complete discussion of such paradoxes would require including the human 
observer as part of the quantum system, and using a quantum model of conscious 
human choice. This would be rather difficult to do given the current primitive state 
of scientific understanding of human consciousness. Fortunately, for most quantum 
paradoxes it seems possible to evade the issue of human consciousness by letting 
the outcome of a quantum coin toss “decide” what will be measured. As discussed 
in Sec. 19.2, the quantum coin is a purely physical device connected to a suitable 
servomechanism. By this means the stochastic nature of quantum mechanics can 
be used as a tool to model something which is indeterminate, which cannot be 
known in advance. 

Certain quantum paradoxes are stated in terms of counterfactuals: what would 
have happened if some state of affairs had been different from what it actually was. 
Other paradoxes have both a counterfactual and an “ordinary” form. In
order to discuss counterfactual quantum paradoxes, one needs a quantum version 
of counterfactual reasoning. Unfortunately, philosophers and logicians have yet 
to reach agreement on what constitutes valid counterfactual reasoning in the clas- 
sical domain. Our strategy will be to avoid the difficult problems which perplex 
the philosophers, such as “Would a kangaroo topple if it had no tail?”, and focus 
on a rather select group of counterfactual questions which arise in a probabilistic 
context. These are of the general form: “What would have happened if the coin 
flip had resulted in heads rather than tails?” They are considered first from a classi- 
cal (or everyday world) perspective in Sec. 19.3, and then translated into quantum 
terms in Sec. 19.4. 


19.2 Quantum coins 

In a world governed by classical determinism there are no truly random events. 
But quantum mechanics allows for events which are irreducibly probabilistic. For 
example, a photon is sent into a beam splitter and detected by one of two detec- 
tors situated on the two output channels. Quantum theory allows us to assign a 
probability that one detector or the other will detect the photon, but provides no 
deterministic prediction of which detector will do so in any particular realization
of the experiment. This system generates a random output in the same way as
tossing a coin, which is why it is reasonable to call it a quantum coin. One can 
arrange things so that the probabilities for the two outcomes are not the same, or 
so that there are three or even more random outcomes, with equal or unequal prob- 
abilities. We shall use the term “quantum coin” to refer to any such device, and 
“quantum coin toss” to refer to the corresponding stochastic process. There is no 
reason in principle why various experiments involving statistical sampling (such as 
drug trials) should not be carried out using the “genuine randomness” of quantum 
coins. 

To illustrate the sort of thing we have in mind, consider the gedanken experiment
in Fig. 19.1, in which a particle, initially in a wave packet $|0a\rangle$, is approaching a
point P where a beam splitter B may or may not be located depending upon the
outcome of tossing a quantum coin Q shortly before the particle arrives at P. If
the outcome of the toss is $Q'$, the beam splitter is left in place at $B'$, whereas if it
is $Q''$, a servomechanism rapidly moves the beam splitter to $B''$ out of the path of
the particle, which continues in a straight line.



Fig. 19.1. Particle paths approaching and leaving a beam splitter which is either left in
place, $B'$, or moved out of the way, $B''$, before the arrival of the particle.

Let us describe this in quantum terms in the following way. Suppose that $|Q\rangle$ is
the initial state of the quantum coin and the attached servomechanism at time $t_0$,
and that between $t_0$ and $t_1$ there is a unitary time evolution

$$|Q\rangle \mapsto (|Q'\rangle + |Q''\rangle)/\sqrt{2}. \qquad (19.1)$$

Next, let $|B'\rangle$ and $|B''\rangle$ be states corresponding to the beam splitter being either
left in place or moved out of the path of the particle, and assume a unitary time
evolution

$$|Q'\rangle|B'\rangle \mapsto |Q'\rangle|B'\rangle, \quad |Q''\rangle|B'\rangle \mapsto |Q''\rangle|B''\rangle \qquad (19.2)$$

between $t_1$ and $t_2$. Finally, the motion of the particle from $t_2$ to $t_3$ is governed by

$$|2a\rangle|B'\rangle \mapsto (|3c\rangle + |3d\rangle)|B'\rangle/\sqrt{2}, \quad |2a\rangle|B''\rangle \mapsto |3d\rangle|B''\rangle, \qquad (19.3)$$


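where $|2a\rangle$ is a wave packet on path a for the particle at time $t_2$, and a similar
notation is used for wave packets on paths c and d in Fig. 19.1. The overall unitary
time evolution of the system consisting of the particle, the quantum coin, and the
apparatus during the time interval from $t_0$ until $t_3$ takes the form

$$\begin{aligned}
|\Psi_0\rangle = |0a\rangle \otimes |Q\rangle|B'\rangle
&\mapsto |1a\rangle \otimes (|Q'\rangle + |Q''\rangle)|B'\rangle/\sqrt{2} \\
&\mapsto |2a\rangle \otimes (|Q'\rangle|B'\rangle + |Q''\rangle|B''\rangle)/\sqrt{2} \\
&\mapsto (|3c\rangle + |3d\rangle) \otimes |Q'\rangle|B'\rangle/2 + |3d\rangle \otimes |Q''\rangle|B''\rangle/\sqrt{2},
\end{aligned} \qquad (19.4)$$

where $\otimes$ helps to set the particle off from the rest of the quantum state.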
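
The chain of maps in (19.4) can be checked with a short Python sketch. It is a minimal illustration under stated assumptions: basis states are labelled by hypothetical string triples (particle, coin, beam splitter), the free motion of the particle from 0a to 1a to 2a is folded into the coin-toss and servomechanism steps, and each rule is defined only on the basis states that actually occur.

```python
from math import sqrt, isclose

r = 1 / sqrt(2)

def apply(state, rule):
    """Apply a linear map given by rule(label) -> {new_label: amplitude}."""
    out = {}
    for label, amp in state.items():
        for new_label, factor in rule(label).items():
            out[new_label] = out.get(new_label, 0.0) + amp * factor
    return out

def coin_toss(label):
    """(19.1): the coin goes into an equal superposition of Q' and Q''."""
    _, _, b = label
    return {("1a", "Q'", b): r, ("1a", "Q''", b): r}

def servo(label):
    """(19.2): the splitter stays at B' for Q' and is moved to B'' for Q''."""
    _, q, _ = label
    return {("2a", q, "B'" if q == "Q'" else "B''"): 1.0}

def splitter(label):
    """(19.3): the particle is split at B' and passes straight on at B''."""
    _, q, b = label
    if b == "B'":
        return {("3c", q, b): r, ("3d", q, b): r}
    return {("3d", q, b): 1.0}

psi = {("0a", "Q", "B'"): 1.0}          # initial state of (19.4)
for step in (coin_toss, servo, splitter):
    psi = apply(psi, step)

# Squared magnitudes of the three terms in the last line of (19.4).
probs = {label: abs(amp) ** 2 for label, amp in psi.items()}
```

The three surviving amplitudes are 1/2, 1/2 and 1/√2, matching the coefficients in the last line of (19.4).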

There are reasons, discussed in Sec. 17.4, why macroscopic objects are best 
described not with individual kets but with macro projectors, or statistical distribu- 
tions or density matrices. The use of kets is not misleading, however, and it makes 
the reasoning somewhat simpler. With a little effort — again, see Sec. 17.4 — one 
can reconstruct arguments of the sort we shall be considering so that macroscopic 
properties are represented by macro projectors. While we will continue to use the 
simpler arguments, projectors representing macroscopic properties will be denoted 
by symbols without square brackets, as in (19.5), so that the formulas remain un- 
changed in a more sophisticated analysis. 

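Consider the consistent family for the times $t_0 < t_1 < t_2 < t_3$ with support
consisting of the two histories

$$\Psi_0 \odot \begin{cases} Q' \odot B' \odot [3\bar{d}], \\ Q'' \odot B'' \odot [3d], \end{cases} \qquad (19.5)$$

where

$$|3\bar{d}\rangle := (|3c\rangle + |3d\rangle)/\sqrt{2}. \qquad (19.6)$$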


It allows one to say that if the quantum coin outcome is $Q'$, then the particle is later
in the coherent superposition state $|3\bar{d}\rangle$, a state which could be detected by bringing
the beams back together again and passing them through a second beam splitter,
as in Fig. 18.3(c). On the other hand, if the outcome is $Q''$, then the particle will
later be in channel d in a wave packet $|3d\rangle$. As $[3\bar{d}]$ and $[3d]$ do not commute with
each other, it is clear that these final states in (19.5) are dependent, in the sense
discussed in Ch. 14, either upon the earlier beam splitter locations $|B'\rangle$ and $|B''\rangle$,
or the still earlier outcomes $|Q'\rangle$ and $|Q''\rangle$ of the quantum coin toss.
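
That the projector onto the superposition (19.6) and the projector onto the wave packet in channel d fail to commute can be verified directly. The sketch below is an illustration rather than part of the original argument: it writes both projectors as 2×2 matrices in the basis (3c, 3d), and the helper `matmul` and the variable names are assumptions.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Projector onto |3d> in the basis (|3c>, |3d>).
P_d = [[0, 0], [0, 1]]
# Projector onto (|3c> + |3d>)/sqrt(2), the state defined in (19.6).
P_dbar = [[0.5, 0.5], [0.5, 0.5]]

left = matmul(P_dbar, P_d)
right = matmul(P_d, P_dbar)
commutator = [[left[i][j] - right[i][j] for j in range(2)] for i in range(2)]
```

The off-diagonal entries of `commutator` are nonzero, so the two properties cannot be combined in a single framework at the final time.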



The expressions in (19.4) are a bit cumbersome, and the same effect can be
achieved with a somewhat simpler notation in which (19.1) and (19.2) are replaced
by the single expression

$$|B_0\rangle \mapsto (|B'\rangle + |B''\rangle)/\sqrt{2}, \qquad (19.7)$$

where $|B_0\rangle$ is the initial state of the entire apparatus, including the quantum coin
and the beam splitter, whereas $|B'\rangle$ and $|B''\rangle$ are apparatus states in which the beam
splitter is at the locations $B'$ and $B''$ indicated in Fig. 19.1. The time development
of the particle in interaction with the beam splitter is given, as before, by (19.3).


19.3 Stochastic counterfactuals 

A workman falls from a scaffolding, but is caught by a safety net, so he is not 
injured. What would have happened if the safety net had not been present? This 
is an example of a counterfactual question, where one has to imagine something 
different from what actually exists, and then draw some conclusion. Answering it 
involves counterfactual reasoning, which is employed all the time in the everyday 
world, though it is still not entirely understood by philosophers and logicians. In 
essence it involves comparing two or more possible states-of-affairs, often referred 
to as “worlds”, which are similar in certain respects and differ in others. In the 
example just considered, a world in which the safety net is present is compared to 
a world in which it is absent, while both worlds have in common the feature that 
the workman falls from the scaffolding. 

We begin our study of counterfactual reasoning by looking at a scheme which 
is able to address a limited class of counterfactual questions in a classical but 
stochastic world, that is, one in which there is a random element added to clas- 
sical dynamics. The world of everyday experience is such a world, since classical 
physics gives deterministic answers to some questions, but there are others, e.g., 
“What will the weather be two weeks from now?”, for which only probabilistic 
answers are available. 

Shall we play badminton or tennis this afternoon? Let us toss a coin: H (heads)
for badminton, T (tails) for tennis. The coin turns up T, so we play tennis. What
would have happened if the result of the coin toss had been H? It is useful to
introduce a diagrammatic way of representing the question and deriving an answer,
Fig. 19.2. The node at the left at time $t_1$ represents the situation before the coin
toss, and the two nodes at $t_2$ are the mutually-exclusive possibilities resulting from
that toss. The lower branch represents what actually occurred: the toss resulted in
T and a game of tennis. To answer the question of what would have happened if
the coin had turned up the other way, we start from the node representing what
actually happened, go backwards in time to the node preceding the coin toss, which
we shall call the pivot, and then forwards along the alternative branch to arrive
at the badminton game. This type of counterfactual reasoning can be thought of
as comparing histories in two “worlds” which are identical at all times up to and
including the pivot point $t_1$ at which the coin is tossed. After that, one of these
worlds contains the outcome H and the consequences which flow from this,
including a game of badminton, while the other world contains the outcome T and
its consequences.


Fig. 19.2. Diagram for counterfactual analysis of a coin toss. (Diagram not reproduced: a node at $t_1$ branches to H and T at $t_2$, which lead to Badminton and Tennis at $t_3$.)

It is instructive to embed the preceding example in a slightly more complicated
situation. Let us suppose that the choice between tennis or badminton was preceded
by another: should we go visit the museum, or get some exercise? Once again,
imagine the decision being made by tossing a coin at time $t_0$, with H leading to
exercise and T to a museum visit. At the museum a choice between visiting one
of two exhibits can also be carried out by tossing a coin. The set of possibilities is
shown in Fig. 19.3. Suppose that the actual sequence of the two coins was $H_1 T_2$,
leading to tennis. If the first coin toss had resulted in $T_1$ rather than $H_1$, what
would have happened? Start from the tennis node in Fig. 19.3, go back to the pivot
node $P_0$ at $t_0$ preceding the first coin toss, and then forwards on the alternative, $T_1$
branch. This time there is not a unique possibility, for the second coin toss could
have been either $H_2$ or $T_2$. Thus the appropriate answer would be: Had the first coin
toss resulted in $T_1$, we would have gone to one or the other of the two exhibits at
the museum, each possibility having probability 1/2. That counterfactual questions
have probabilistic answers is just what one would expect if the dynamics describing
the situation is stochastic, rather than deterministic. The answer is deterministic
only in the limiting cases of probabilities equal to 1 or 0.

Fig. 19.3. Diagram for analyzing two successive coin tosses. (Diagram not reproduced: its four final nodes are Badminton, Tennis, Exhibit 1, and Exhibit 2.)

However, a somewhat surprising feature of stochastic counterfactual reasoning
comes to light if we ask the question, again assuming the afternoon was devoted to
tennis, “What would have happened if the first coin had turned up $H_1$ (as it actually
did)?”, and attempt to answer it using the diagram in Fig. 19.3. Let us call this a
null counterfactual question, since it asks not what would have happened if the
world had been different in some way, but what would have happened if the world
had been the same in this particular respect. The answer obtained by tracing from
“tennis” backwards to $P_0$ in Fig. 19.3 and then forwards again along the upper,
or $H_1$, branch is not tennis, but badminton or tennis, each with probability
1/2. We do not, in other words, reach the conclusion that what actually happened
would have happened had the world been the same in respect to the outcome of
the first coin toss. Is it reasonable to have a stochastic answer, with probability
less than 1, for a null counterfactual question? Yes, because to have a deterministic
answer would be to specify implicitly that the second coin toss turned out the way
it actually did. But in a world which is not deterministic there is no reason why
random events should not have turned out differently.

Counterfactual questions are sometimes ambiguous because there is more than 
one possibility for a pivot. For example, “What would we have done if we had not 
played tennis this afternoon?” will be answered in a different way depending upon 
whether $H_1$ or $P_0$ in Fig. 19.3 is used as the pivot. In order to make a counterfactual
question precise, one must specify both a framework of possibilities, as in Fig. 19.3, 
and also a pivot, the point at which the actual and counterfactual worlds, identical 
at earlier times, “split apart”. 
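
The procedure of tracing back to a pivot and then forwards along a chosen branch can be written as a small Python sketch. The tree encodes Fig. 19.3 with fair coins; the nested-dictionary layout, the function names, and the assignment of Exhibit 1 to heads and Exhibit 2 to tails are assumptions made for illustration.

```python
# Each node maps a branch label to (probability, subtree); leaves are strings.
TREE = {
    "H1": (0.5, {"H2": (0.5, "Badminton"), "T2": (0.5, "Tennis")}),
    "T1": (0.5, {"H2": (0.5, "Exhibit 1"), "T2": (0.5, "Exhibit 2")}),
}

def outcomes(subtree, weight=1.0):
    """Probability distribution over the leaves reachable from subtree."""
    if isinstance(subtree, str):
        return {subtree: weight}
    dist = {}
    for prob, child in subtree.values():
        for leaf, w in outcomes(child, weight * prob).items():
            dist[leaf] = dist.get(leaf, 0.0) + w
    return dist

def counterfactual(tree, pivot_path, branch):
    """Go to the pivot reached by pivot_path, then forwards along branch."""
    node = tree
    for label in pivot_path:
        node = node[label][1]
    return outcomes(node[branch][1])

# Pivot P0 (empty path): first coin T1 instead of H1.
via_T1 = counterfactual(TREE, [], "T1")
# Null counterfactual with pivot P0: first coin H1, as it actually was.
null_H1 = counterfactual(TREE, [], "H1")
# Pivot at the H1 node: second coin H2 instead of T2.
pivot_H1 = counterfactual(TREE, ["H1"], "H2")
```

With the pivot at the root, the alternative branch gives the two exhibits with probability 1/2 each, and the null counterfactual gives badminton or tennis with probability 1/2 each. Moving the pivot to the node after the first toss gives badminton with certainty, which shows how the answer depends on the pivot.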

This method of reasoning is useful for answering some types of counterfactual 
questions but not others. Even to use it for the case of a workman whose fall is 
broken by a safety net requires an exercise in imagination. Let us suppose that just 
after the workman started to fall (the pivot), the safety net was swiftly removed, or 
left in place, depending upon some rapid electronic coin toss, so that the situation 
could be represented in a diagram similar to Fig. 19.2. Is this an adequate, or 
at least a useful, way of thinking about this counterfactual question? At least it 
represents a way to get started, and we shall employ the same idea for quantum 
counterfactuals.


19.4 Quantum counterfactuals

Counterfactuals have played an important role in discussions of quantum measurements.
Thus a perennial question in the foundations of quantum theory is whether
measurements reveal pre-existing properties of a measured system, or whether they
somehow “create” such properties. Suppose, to take an example, that a Stern-Gerlach
measurement reveals the value $S_x = 1/2$ for a spin-half particle. Would
the particle have had the same value of $S_x$ even if the measurement had not been
made? An interpretation of quantum theory which gives a “yes” answer to this
counterfactual question can be said to be realistic in that it affirms the existence of
certain properties or events in the world independent of whether measurements are
made. (For some comments on realism in quantum theory, see Ch. 27.) Another
similar counterfactual question is the following: Given that the $S_x$ measurement
outcome indicates, using an appropriate framework (see Ch. 17), that the value of
$S_x$ was $+1/2$ before the measurement, would this still have been the case if $S_z$ had
been measured instead of $S_x$?

The system of quantum counterfactual reasoning presented here is designed to 
answer these and similar questions. It is quite similar to that introduced in the 
previous section for addressing classical counterfactual questions. It makes use 
of quantum coins of the sort discussed in Sec. 19.2, and diagrams like those in 
Figs. 19.2 and 19.3. The nodes in these diagrams represent events in a consistent 
family of quantum histories, and nodes connected by lines indicate the histories 
with finite weight that form the support of the family. We require that the family 
be consistent, and that all the histories in the diagram belong to the same consis- 
tent family. This is a single-framework rule for quantum counterfactual reasoning 
comparable to the one discussed in Sec. 16.1 for ordinary quantum reasoning. 

Let us see how this works in the case in which $S_x$ is the component of spin
actually measured for a spin-half particle, and we are interested in what would
have been the case if $S_z$ had been measured instead. Imagine a Stern-Gerlach
apparatus of the sort discussed in Sec. 17.2 or Sec. 18.3, arranged so that it can be
rotated about an axis (in the manner indicated in Sec. 18.3) to measure either $S_x$ or
$S_z$. When ready to measure $S_x$ its initial state is $|X^\circ\rangle$, and its interaction with the
particle results in the unitary time development

$$|x^+\rangle \otimes |X^\circ\rangle \mapsto |X^+\rangle, \quad |x^-\rangle \otimes |X^\circ\rangle \mapsto |X^-\rangle. \qquad (19.8)$$

Similarly, when oriented to measure $S_z$ the initial state is $|Z^\circ\rangle$, and the corresponding
time development is

$$|z^+\rangle \otimes |Z^\circ\rangle \mapsto |Z^+\rangle, \quad |z^-\rangle \otimes |Z^\circ\rangle \mapsto |Z^-\rangle. \qquad (19.9)$$

The symbols $X^\circ$, etc., without square brackets will be used to denote the corresponding
projectors. Because they refer to macroscopically distinct states, all the
Z projectors are orthogonal to all the X projectors: $X^+ Z^+ = 0$, etc. Without loss
of generality we can consider the quantum coin and the associated servomechanism
to be part of the Stern-Gerlach apparatus, which is initially in the state $|A\rangle$,
with the coin toss corresponding to a unitary time development

$$|A\rangle \mapsto (|X^\circ\rangle + |Z^\circ\rangle)/\sqrt{2}. \qquad (19.10)$$


Assume that the spin-half particle is prepared in an initial state $|w^+\rangle$, where the
exact choice of w is not important for the following discussion, provided it is not
+x, -x, +z, or -z. Suppose that $X^+$ is observed: the quantum coin resulted in
the apparatus state $X^\circ$ appropriate for a measurement of $S_x$, and the outcome of
the measurement corresponds to $S_x = +1/2$. What would have happened if the
quantum coin toss had, instead, resulted in the apparatus state $Z^\circ$ appropriate for
a measurement of $S_z$? To address this question we must adopt some consistent
family and identify the event which serves as the pivot. As in other examples of
quantum reasoning, there is more than one possible family, and the answer given
to a counterfactual question can depend upon which family one uses. Let us begin
with a family whose support consists of the four histories


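$$\Psi_0 \odot I \odot \begin{cases} X^\circ \odot \begin{cases} X^+, \\ X^-, \end{cases} \\ Z^\circ \odot \begin{cases} Z^+, \\ Z^- \end{cases} \end{cases} \qquad (19.11)$$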


at the times $t_0 < t_1 < t_2 < t_3$, where $|\Psi_0\rangle = |w^+\rangle \otimes |A\rangle$ is the initial state. It is
represented in Fig. 19.4 in a diagram resembling those in Figs. 19.2 and 19.3. The
quantum coin toss (19.10) takes place between $t_1$ and $t_2$. The particle reaches the
Stern-Gerlach apparatus and the measurement occurs between $t_2$ and $t_3$, and at $t_3$
the outcome of the measurement is indicated by one of the four pointer states (end
of Sec. 9.5) $X^\pm$, $Z^\pm$. Notice that only the first branching in Fig. 19.4, between $t_1$
and $t_2$, corresponds to the alternative outcomes of the quantum coin toss, while the
later branching is due to other stochastic quantum processes.

Fig. 19.4. Diagram for counterfactual analysis of the family (19.11).

Suppose $S_x$ was measured with the result $X^+$. To answer the question of what
would have occurred if $S_z$ had been measured instead, start with the $X^+$ vertex
in Fig. 19.4, trace the history back to $I$ at $t_1$ (or $\Psi_0$ at $t_0$) as a pivot, and then go
forwards on the lower branch of the diagram through the $Z^\circ$ node. The answer
is that one of the two outcomes $Z^+$ or $Z^-$ would have occurred, each possibility
having a positive probability which depends on w, which seems reasonable. Rather
than using the nodes in Fig. 19.4, one can equally well use the support of the
consistent family written in the form (19.11), as there is an obvious correspondence
between the nodes in the former and positions of the projectors in the latter. From
now on we will base counterfactual reasoning on expressions of the form (19.11),
interpreted as diagrams with nodes and lines in the fashion indicated in Fig. 19.4.
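
For the family (19.11), the probabilities that appear in the counterfactual answer follow from the Born rule. The Python sketch below is an illustration under assumptions not fixed by the text: the direction w is parametrized by polar angles `theta` and `phi` in the usual spin-half convention, the coin is fair as in (19.10), and the function names are hypothetical.

```python
from cmath import exp
from math import cos, sin, sqrt, pi, isclose

def spin_state(theta, phi):
    """Components of |w+> in the (|z+>, |z->) basis."""
    return (cos(theta / 2), exp(1j * phi) * sin(theta / 2))

def family_19_11(theta, phi):
    """Weights of the four histories in the support (19.11)."""
    a, b = spin_state(theta, phi)
    p_x_plus = abs((a + b) / sqrt(2)) ** 2   # |<x+|w+>|^2
    p_z_plus = abs(a) ** 2                   # |<z+|w+>|^2
    return {
        "X+": 0.5 * p_x_plus, "X-": 0.5 * (1 - p_x_plus),
        "Z+": 0.5 * p_z_plus, "Z-": 0.5 * (1 - p_z_plus),
    }

def counterfactual_Sz(theta, phi):
    """Pivot at I (time t1): go forwards through the Z-apparatus branch."""
    w = family_19_11(theta, phi)
    total = w["Z+"] + w["Z-"]
    return {"Z+": w["Z+"] / total, "Z-": w["Z-"] / total}

# Example direction in the x-z plane, 60 degrees from the z axis.
answer = counterfactual_Sz(pi / 3, 0.0)
```

For this choice of w the two outcomes have probabilities 3/4 and 1/4; both are positive for any w other than +z or -z.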

Now ask a different question. Assuming, once again, that $X^+$ was the actual outcome,
what would have happened if the quantum coin had resulted (as it actually
did) in $X^\circ$ and thus a measurement of $S_x$? To answer this null counterfactual question,
we once again trace the actual history in (19.11) or Fig. 19.4 backwards from
$X^+$ at $t_3$ to the $I$ or the $\Psi_0$ node, and then forwards again along the upper branch
through the $X^\circ$ node at $t_2$, since we are imagining a world in which the quantum
coin toss had the same result as in the actual world. The answer to the question is
that either $X^+$ or $X^-$ would have occurred, each possibility having some positive
probability. Since quantum dynamics is intrinsically stochastic in ways which are
not limited to a quantum coin toss, there is no reason to suppose that what actually
did occur, $X^+$, would necessarily have occurred, given only that we suppose the
same outcome, $X^\circ$ rather than $Z^\circ$, for the coin toss.

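Nevertheless, it is possible to obtain a more definitive answer to this null counterfactual
question by using a different consistent family with support

$$\Psi_0 \odot \begin{cases} [x^+] \odot \begin{cases} X^\circ \odot X^+, \\ Z^\circ \odot U^+, \end{cases} \\ [x^-] \odot \begin{cases} X^\circ \odot X^-, \\ Z^\circ \odot U^-, \end{cases} \end{cases} \qquad (19.12)$$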

where the nodes $[x^\pm]$ at $t_1$, a time which precedes the quantum coin toss, correspond
to the spin states $S_x = \pm 1/2$, and $U^+$ and $U^-$ are defined in the next
paragraph. The history which results in $X^+$ can be traced back to the pivot $[x^+]$,
and then forwards again along the same (upper) branch, since we are assuming that
the quantum coin toss in the alternative (counterfactual) world did result in the $X^\circ$
apparatus state. The result is $X^+$ with probability 1. That this is reasonable can
be seen in the following way. The actual measurement outcome $X^+$ shows that
the particle had $S_x = +1/2$ at time $t_1$ before the measurement took place, since
quantum measurements reveal pre-existing values if one employs a suitable framework.
And by choosing $[x^+]$ at $t_1$ as the pivot, one is assuming that $S_x$ had the
same value at this time in both the actual and the counterfactual world. Therefore
a later measurement of $S_x$ in the counterfactual world would necessarily result in
$X^+$.

However, we find something odd if we use (19.12) to answer our earlier counterfactual
question of what would have happened if $S_z$ had been measured rather
than $S_x$. Tracing the actual history backwards from $X^+$ to $[x^+]$ and then forwards
along the lower branch in the upper part of (19.12), through $Z^\circ$, we reach $U^+$ at $t_3$
rather than the pair $Z^+$, $Z^-$, as in (19.11) or Fig. 19.4. Here $U^+$ is a projector on
the state $|U^+\rangle$ obtained by unitary time evolution of $|x^+\rangle|Z^\circ\rangle$ using (19.9):

$$|x^+\rangle|Z^\circ\rangle = (|z^+\rangle + |z^-\rangle)|Z^\circ\rangle/\sqrt{2} \mapsto |U^+\rangle = (|Z^+\rangle + |Z^-\rangle)/\sqrt{2}. \qquad (19.13)$$

Similarly, $U^-$ in (19.12) projects on the state obtained by unitary time evolution
of $|x^-\rangle|Z^\circ\rangle$. Both $U^+$ and $U^-$ are macroscopic quantum superposition (MQS)
states. The appearance of these MQS states in (19.12) reflects the need to construct
a family satisfying the consistency conditions, which would be violated were we
to use the pointer states $Z^+$ and $Z^-$ at $t_3$ following the $Z^\circ$ nodes at $t_2$. The fact that
consistency conditions sometimes require MQS states rather than pointer states
is significant for analyzing certain quantum paradoxes, as we shall see in later
chapters.
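
The violation of the consistency conditions mentioned above can be made explicit with chain kets. The following worked calculation is a sketch under two assumptions not spelled out in the text: the initial spin state is expanded with coefficients $\alpha$ and $\beta$ in the $S_x$ basis, using the phase convention that the $S_x = \pm 1/2$ states are the sum and difference of the $S_z = \pm 1/2$ states divided by $\sqrt{2}$, and distinct histories in a consistent family are required to have mutually orthogonal chain kets.

```latex
% Assumed expansion: |w^+> = \alpha |x^+> + \beta |x^->.
% Chain ket for the history \Psi_0 \odot [x^+] \odot Z^\circ \odot Z^+ :
\begin{align*}
|w^+\rangle|A\rangle
  &\xrightarrow{[x^+]\ \text{at}\ t_1} \alpha\,|x^+\rangle|A\rangle
   \xrightarrow{(19.10)} \tfrac{\alpha}{\sqrt{2}}\,|x^+\rangle\bigl(|X^\circ\rangle + |Z^\circ\rangle\bigr) \\
  &\xrightarrow{Z^\circ\ \text{at}\ t_2} \tfrac{\alpha}{\sqrt{2}}\,|x^+\rangle|Z^\circ\rangle
   \xrightarrow{(19.9)} \tfrac{\alpha}{2}\bigl(|Z^+\rangle + |Z^-\rangle\bigr)
   \xrightarrow{Z^+\ \text{at}\ t_3} \tfrac{\alpha}{2}\,|Z^+\rangle .
\end{align*}
% The same steps starting from [x^-] give (\beta/2)|Z^+>, so the inner product is
\[
  \Bigl(\tfrac{\alpha}{2}\Bigr)^{*}\,\tfrac{\beta}{2}\,\langle Z^+|Z^+\rangle
  = \tfrac{1}{4}\,\alpha^{*}\beta \neq 0
  \quad\text{whenever } w \neq \pm x .
\]
```

With $U^\pm$ in place of $Z^\pm$ at $t_3$, the corresponding chain kets are proportional to the two MQS states, which are orthogonal to each other, so the difficulty does not arise.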

The contrasting results obtained using the families in (19.11) and (19.12) illustrate
an important feature of quantum counterfactual reasoning of the type we are
discussing: the outcome depends upon the family of histories which is used, and
also upon the pivot. In order to employ the pivot $[x^+]$ rather than $I$ at $t_1$, it is necessary
to use a family in which the former occurs, and it cannot simply be added to
the family (19.11) by a process of refinement. To be sure, this dependence upon the
framework and pivot is not limited to the quantum case; it also arises for classical
stochastic counterfactual reasoning. However, in a classical situation the framework
is a classical sample space with its associated event algebra, and framework
dependence is rather trivial. One can always, if necessary, refine the sample space,
which corresponds to adding more nodes to a diagram such as Fig. 19.3, and there
is never a problem with incompatibility or MQS states.

Consider a somewhat different question. Suppose the actual measurement outcome
corresponds to $S_x = +1/2$. Would $S_x$ have had the same value if no measurement
had been carried out? To address this question, we employ an obvious modification
of the previous gedanken experiment, in which the quantum coin leads
either to a measurement of $S_x$, as actually occurred, or to no measurement at all,
by swinging the apparatus out of the way before the arrival of the particle. Let
$|N\rangle$ denote the state of the apparatus when it has been swung out of the way. An
appropriate consistent family is one with support

$$\Psi_0 \odot \begin{cases} [x^+] \odot \begin{cases} X^\circ \odot X^+, \\ N \odot [x^+], \end{cases} \\ [x^-] \odot \begin{cases} X^\circ \odot X^-, \\ N \odot [x^-]. \end{cases} \end{cases} \qquad (19.14)$$


It resembles (19.12), but with $Z^\circ$ replaced by $N$, $U^+$ by $[x^+]$, and $U^-$ by $[x^-]$,
since if no measuring apparatus is present, the particle continues on its way in the
same spin state.

We can use this family and the node $[x^+]$ at time $t_1$ to answer the question of
what would have happened in a case in which the measurement result was $S_x = +1/2$
if, contrary to fact, no measurement had been made. Start with the $X^+$ node
at $t_3$, trace it back to $[x^+]$ at $t_1$, and then forwards in time through the $N$ node at $t_2$.
The result is $[x^+]$, so the particle would have been in the state $S_x = +1/2$ at $t_1$ and
at later times if no measurement had been made.


20.1 Statement of the paradox

Consider the Mach-Zehnder interferometer shown in Fig. 20.1. The second beam
splitter can either be at its regular position $B_{\mathrm{in}}$ where the beams from the two
mirrors intersect, as in (a), or moved out of the way to a position $B_{\mathrm{out}}$, as in (b).
When the beam splitter is in place, interference effects mean that a photon which
enters the interferometer through channel a will always emerge in channel f to be
measured by a detector F. On the other hand, when the beam splitter is out of the
way, the probability is 1/2 that the photon will be detected by detector E, and 1/2
that it will be detected by detector F.



Fig. 20.1. Mach-Zehnder interferometer with the second beam splitter (a) in place, (b) 
moved out of the way. 


The paradox is constructed in the following way. Suppose that the beam splitter
is out of the way, Fig. 20.1(b), and the photon is detected in E. Then it seems
plausible that the photon was earlier in the d arm of the interferometer. For example,
were the mirror $M_d$ to be removed, no photons would arrive at E; if the
length of the path in the d arm were doubled by using additional mirrors, the photon
would arrive at E with a time delay, etc. On the other hand, when the beam
splitter is in place, we understand the fact that the photon always arrives at F as due
to an interference effect arising from a coherent superposition state of photon wave
packets in both arms c and d. That this is the correct explanation can be supported
by placing phase shifters in the two arms, Sec. 13.2, and observing that the phase
difference must be kept constant in order for the photon to always be detected in
F. Similarly, removing either of the mirrors will spoil the interference effect.

Suppose, however, that the beam splitter is in place until just before the photon 
reaches it, and is then suddenly moved out of the way. What will happen? Since 
the photon does not interact with the beam splitter, we conclude that the situation 
is the same as if the beam splitter had been absent all along. If the photon arrives 
at E, then it was earlier in the d arm of the interferometer. But this seems strange, 
because if the beam splitter had been left in place, the photon would surely have 
been detected by F, which requires, as noted above, that inside the interferometer 
it is in a superposition state between the two arms. Hence it would seem that a 
later event, the position or absence of the beam splitter as decided at the very last
moment before the photon's arrival, must somehow influence the earlier state of the photon,
when it was in the interferometer far away from the beam splitter, and determine 
whether it is in one of the individual arms or in a superposition state. How can this 
be? Can the future influence the past? 

The reader may be concerned that given the dimensions of a typical labora- 
tory Mach-Zehnder interferometer and a photon moving with the speed of light, it 
would be physically impossible to shift the beam splitter out of the way while the 
photon is inside the interferometer. But we could imagine a very large interfero- 
meter constructed someplace out in space so as to allow time for the mechanical 
motion. Also, modified forms of the delayed choice experiment can be constructed 
in the laboratory using tricks involving photon polarization and fast electronic 
devices. 

It is possible to state the paradox in counterfactual terms. Suppose the beam 
splitter is not in place and the photon is detected by E, indicating that it was earlier 
in the d arm of the interferometer. What would have occurred if the beam splitter 
had been in place? On the one hand, it seems reasonable to argue that the photon 
would certainly have been detected by F; after all, it is always detected by F when 
the beam splitter is in place. On the other hand, experience shows that if a photon 
arrives in the d channel and encounters the beam splitter, it has a probability of 1/2 
of emerging in either of the two exit channels. This second conclusion is hard to 
reconcile with the first. 

20.2 Unitary dynamics 

Let $|0a\rangle$ be the photon state at $t_0$ when the photon is in channel $a$, Fig. 20.1, just before entering the interferometer through the first (immovable) beam splitter, and let the unitary evolution up to a time $t_1$ be given by 

$$|0a\rangle \mapsto |1\bar a\rangle := (|1c\rangle + |1d\rangle)/\sqrt{2}, \tag{20.1}$$

where $|1c\rangle$ and $|1d\rangle$ are photon wave packets in the $c$ and $d$ arms of the interferometer. These in turn evolve unitarily, 

$$|1c\rangle \mapsto |2c\rangle, \quad |1d\rangle \mapsto |2d\rangle, \tag{20.2}$$

to wave packets $|2c\rangle$ and $|2d\rangle$ in the $c$ and $d$ arms at a time $t_2$ just before the photon reaches the second (movable) beam splitter. 

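What happens next depends upon whether this beam splitter is in or out. If it is in, then 

$$B_{\rm in}: \quad |2c\rangle \mapsto |3\bar c\rangle, \quad |2d\rangle \mapsto |3\bar d\rangle, \tag{20.3}$$

where 

$$|3\bar c\rangle := (|3e\rangle + |3f\rangle)/\sqrt{2}, \quad |3\bar d\rangle := (-|3e\rangle + |3f\rangle)/\sqrt{2}, \tag{20.4}$$

and $|3e\rangle$ and $|3f\rangle$ are photon wave packets at time $t_3$ in the $e$ and the $f$ output channels. If the beam splitter is out, the behavior is rather simple: 

$$B_{\rm out}: \quad |2c\rangle \mapsto |3f\rangle, \quad |2d\rangle \mapsto |3e\rangle. \tag{20.5}$$

Finally, the detection of the photon during the time interval from $t_3$ to $t_4$ is described by 

$$|3e\rangle|E^\circ\rangle \mapsto |E^*\rangle, \quad |3f\rangle|F^\circ\rangle \mapsto |F^*\rangle. \tag{20.6}$$

Here $|E^\circ\rangle$ and $|F^\circ\rangle$ are the ready states of the two detectors, and $|E^*\rangle$ and $|F^*\rangle$ the states in which a photon has been detected. 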

The overall time development starting with an initial state 

$$|\Psi_0\rangle = |0a\rangle|E^\circ\rangle|F^\circ\rangle \tag{20.7}$$

at time $t_0$ leads to a succession of states $|\Psi_j\rangle$ at time $t_j$. These can be worked out by putting together the different transformations indicated in (20.1)-(20.6), assuming the detectors do not change except for the processes indicated in (20.6). For $j > 2$ the result depends upon whether the (second) beam splitter is in or out. At $t_4$ with the beam splitter in one finds 

$$B_{\rm in}: \quad |\Psi_4\rangle = |E^\circ\rangle|F^*\rangle, \tag{20.8}$$

whereas if the beam splitter is out, the result is a macroscopic quantum superposition (MQS) state 

$$B_{\rm out}: \quad |\Psi_4\rangle = |S^+\rangle := (|E^*\rangle|F^\circ\rangle + |E^\circ\rangle|F^*\rangle)/\sqrt{2}. \tag{20.9}$$

A second MQS state 

$$|S^-\rangle := (-|E^*\rangle|F^\circ\rangle + |E^\circ\rangle|F^*\rangle)/\sqrt{2}, \tag{20.10}$$

orthogonal to $|S^+\rangle$, will be needed later. 
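
The composition of the steps (20.1)-(20.6) can be checked with a few lines of code. The following is only a minimal sketch, not part of the original analysis: states are stored as dictionaries from basis labels to amplitudes, and the labels ("0a", "3e", "E*F0" for $|E^*\rangle|F^\circ\rangle$, and so on) as well as all function names are illustrative assumptions.

```python
from math import sqrt

R = 1 / sqrt(2)  # the amplitude 1/sqrt(2) appearing in (20.1) and (20.4)

def apply(rule, state):
    """Apply a linear map, specified by its action on basis labels, to a state."""
    out = {}
    for label, amp in state.items():
        for new_label, coeff in rule(label).items():
            out[new_label] = out.get(new_label, 0.0) + amp * coeff
    # Drop amplitudes that cancel (up to rounding).
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

def enter(label):          # (20.1): |0a> -> (|1c> + |1d>)/sqrt(2)
    return {"1c": R, "1d": R}

def propagate(label):      # (20.2): |1c> -> |2c>, |1d> -> |2d>
    return {"2" + label[1]: 1.0}

def splitter_in(label):    # (20.3) and (20.4)
    sign = 1.0 if label == "2c" else -1.0
    return {"3e": sign * R, "3f": R}

def splitter_out(label):   # (20.5)
    return {"3f": 1.0} if label == "2c" else {"3e": 1.0}

def detect(label):         # (20.6), written as joint detector states
    return {"E*F0": 1.0} if label == "3e" else {"E0F*": 1.0}

def final_state(beam_splitter_in):
    """State at t4, starting from |0a> with both detectors ready."""
    state = {"0a": 1.0}
    second = splitter_in if beam_splitter_in else splitter_out
    for rule in (enter, propagate, second, detect):
        state = apply(rule, state)
    return state

if __name__ == "__main__":
    print("beam splitter in: ", final_state(True))
    print("beam splitter out:", final_state(False))
```

With the beam splitter in, the two contributions to the $e$ channel cancel and only the component corresponding to (20.8) survives; with it out, both detector components appear with equal amplitude, as in (20.9).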


20.3 Some consistent families 

Let us first consider the case in which the beam splitter is out. Unitary evolution leading to the MQS state $|S^+\rangle$, (20.9), at $t_4$ obviously does not provide a satisfactory way to describe the outcome of the final measurement. Consequently, we begin by considering the consistent family whose support consists of the two histories 

$$B_{\rm out}: \quad \Psi_0 \odot [1\bar a] \odot [2\bar a] \odot [3\bar c] \odot \{E^*, F^*\} \tag{20.11}$$

at the times $t_0 < t_1 < t_2 < t_3 < t_4$. Here and later we use symbols without square brackets for projectors corresponding to macroscopic properties; see the remarks in Sec. 19.2 following (19.4). This family resembles ones used for wave function collapse, Sec. 18.2, in that there is unitary time evolution preceding the measurement outcomes. For this reason, however, it does not allow us to make the inference required in the statement of the paradox in Sec. 20.1, that if the photon is detected by $E$ (final state $E^*$), it was earlier in the $d$ arm of the interferometer. Such an assertion at $t_1$ or $t_2$ is incompatible with $[1\bar a]$ or $[2\bar a]$, as these projectors do not commute with the projectors $C$, $D$ for the photon to be in the $c$ or the $d$ arm. (For toy versions of $C$ and $D$, see (12.9) in Sec. 12.1.) In order to translate the paradox into quantum mechanical terms we need to use a different consistent family, such as the one with support 

$$B_{\rm out}: \quad \Psi_0 \odot \begin{cases} [1c] \odot [2c] \odot [3f] \odot F^*, \\ [1d] \odot [2d] \odot [3e] \odot E^*. \end{cases} \tag{20.12}$$


Each of these histories has weight 1/2, and using this family one can infer that 

$$B_{\rm out}: \quad \Pr([1d]_1 \mid E^*_4) = \Pr([2d]_2 \mid E^*_4) = 1, \tag{20.13}$$

where, as usual, subscripts indicate the times of events. That is, if the photon is detected by $E$ with the beam splitter out, then it was earlier in the $d$ and not in the $c$ arm of the interferometer. Note, however, that using the consistent family (20.11) leads to the equally valid result 

$$B_{\rm out}: \quad \Pr([1\bar a]_1 \mid E^*_4) = \Pr([2\bar a]_2 \mid E^*_4) = 1. \tag{20.14}$$

The single-framework rule prevents one from combining (20.13) and (20.14), because the families (20.11) and (20.12) are mutually incompatible. 

Next, consider the situation in which the beam splitter is in place. In this case the unitary history 

$$B_{\rm in}: \quad \Psi_0 \odot [1\bar a] \odot [2\bar a] \odot [3f] \odot F^* \tag{20.15}$$

allows one to discuss the outcome of the final measurement. It describes the photon using coherent superpositions of wave packets in the two arms at times $t_1$ and $t_2$, as suggested by the statement of the paradox. Based upon it one can conclude that 

$$B_{\rm in}: \quad \Pr([1\bar a]_1 \mid F^*_4) = \Pr([2\bar a]_2 \mid F^*_4) = 1, \tag{20.16}$$

which is the analog of (20.14). (While (20.14) and (20.16) are correct as written, one should note that the conditions $E^*$ and $F^*$ at $t_4$ are not necessary, and the probabilities are still equal to 1 if one omits the final detector states from the condition. It is helpful to think of $\Psi_0$ as always present as a condition, even though it is not explicitly indicated in the notation.) On the other hand, it is also possible to construct the counterpart of (20.12) in which the photon is in a definite arm at $t_1$ and $t_2$, using the family with support 

$$B_{\rm in}: \quad \Psi_0 \odot \begin{cases} [1c] \odot [2c] \odot [3\bar c] \odot S^+, \\ [1d] \odot [2d] \odot [3\bar d] \odot S^-, \end{cases} \tag{20.17}$$

where $S^+$ and $S^-$ are projectors onto the MQS states defined in (20.9) and (20.10). Note that the MQS states in (20.17) cannot be replaced with pairs of pointer states $\{E^*, F^*\}$ as in (20.11), since the four histories would then form an inconsistent family. See the toy model example in Sec. 13.3. 
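
The inconsistency just mentioned can be made concrete by computing chain kets. The sketch below is an illustration under stated assumptions rather than the book's own calculation: it works only in the two-dimensional detector subspace spanned by $|E^*\rangle|F^\circ\rangle$ and $|E^\circ\rangle|F^*\rangle$ (labelled "E*F0" and "E0F*"), where each final projector acts as a projection onto a single vector, and it takes orthogonality of the chain kets of distinct histories as the consistency condition. All names are hypothetical.

```python
from math import sqrt

R = 1 / sqrt(2)

# Joint detector states at t4 in the basis {|E*>|F0>, |E0>|F*>}.
E_STAR = {"E*F0": 1.0}               # pointer state: E has detected the photon
F_STAR = {"E0F*": 1.0}               # pointer state: F has detected the photon
S_PLUS = {"E*F0": R, "E0F*": R}      # MQS state (20.9)
S_MINUS = {"E*F0": -R, "E0F*": R}    # MQS state (20.10)

def inner(u, v):
    """Inner product of two real-amplitude vectors stored as dicts."""
    return sum(amp * v.get(label, 0.0) for label, amp in u.items())

def branch_at_t4(arm):
    """State at t4 after projecting |1a-bar> onto arm 'c' or 'd' (beam splitter in)."""
    sign = 1.0 if arm == "c" else -1.0
    # One factor R from (20.1), then (20.3), (20.4) and (20.6).
    return {"E*F0": R * sign * R, "E0F*": R * R}

def chain_ket(arm, final):
    """Project the branch onto the ray spanned by the normalized vector `final`."""
    amp = inner(final, branch_at_t4(arm))
    return {label: amp * coeff for label, coeff in final.items()}

def is_consistent(histories):
    """True if the chain kets of all distinct histories are mutually orthogonal."""
    kets = [chain_ket(arm, final) for arm, final in histories]
    return all(abs(inner(kets[i], kets[j])) < 1e-12
               for i in range(len(kets)) for j in range(i + 1, len(kets)))

# Support of (20.17).
mqs_family = [("c", S_PLUS), ("d", S_MINUS)]
# Hypothetical variant with the MQS states replaced by pointer states.
pointer_family = [("c", E_STAR), ("c", F_STAR), ("d", E_STAR), ("d", F_STAR)]

if __name__ == "__main__":
    print("MQS family consistent:    ", is_consistent(mqs_family))
    print("pointer family consistent:", is_consistent(pointer_family))
```

In the pointer-state variant the chain kets for "c arm, then $E^*$" and "d arm, then $E^*$" are both multiples of $|E^*\rangle|F^\circ\rangle$ with amplitudes $+1/2$ and $-1/2$, so they are not orthogonal, which is the interference responsible for the inconsistency.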

It is worth emphasizing the fact that there is nothing “wrong” with MQS states 
from the viewpoint of fundamental quantum theory. If one supposes that the usual 
Hilbert space structure of quantum mechanics is the appropriate sort of mathemat- 
ics for describing the world, then MQS states will be present in the theory, because 
the Hilbert space is a linear vector space, so that if it contains the states $|E^*\rangle|F^\circ\rangle$ 
and $|E^\circ\rangle|F^*\rangle$, it must also contain their linear combinations. However, if one is 
interested in discussing a situation in which a photon is detected by a detector, 
(20.17) is not appropriate, as within this framework the notion that one detector or 
the other has detected the photon makes no sense. 

Let us summarize the results of our analysis as it bears upon the paradox stated 
in Sec. 20.1. No consistent families were actually specified in the initial statement 
of the paradox, and we have used four different families in an effort to analyze it: 
two with the beam splitter out, (20.11) and (20.12), and two with the beam splitter 
in, (20.15) and (20.17). In a sense, the paradox is based upon using only two 
of these families, (20.15) with $B_{\rm in}$ and the photon in a superposition state inside 
the interferometer, and (20.12) with $B_{\rm out}$ and the photon in a definite arm of the 
interferometer. By focusing on only these two families (they are, of course, only 
specified implicitly in the statement of the paradox) one can get the misleading 
impression that the difference between the photon states inside the interferometer 
in the two cases is somehow caused by the presence or absence of the beam splitter 
at a later time when the photon leaves the interferometer. But by introducing the 
other two families, we see that it is quite possible to have the photon either in a 
superposition state or in a definite arm of the interferometer both when the beam 
splitter is in place and when it is out of the way. Thus the difference in the type 
of photon state employed at $t_1$ and $t_2$ is not determined or caused by the location 
of the beam splitter; rather, it is a consequence of a choice of a particular type of 
quantum description for use in analyzing the problem. 

One can, to be sure, object that (20.17) with the detectors in MQS states at $t_4$ 
is hardly a very satisfactory description of a situation in which one is interested 
in which detector detected the photon. It is true that if one wants a description in 
which no MQS states appear, then (20.15) is to be preferred to (20.17). But notice 
that what the physicist does in employing this altogether reasonable criterion is 
somewhat analogous to what a writer of a novel does when changing the plot in 
order to have the ending work out in a particular way. The physicist is selecting 
histories which at $t_4$ will be useful for addressing the question of which detector 
detected the photon, and not whether the detector system will end up in $S^+$ or 
$S^-$, and for this purpose (20.15), not (20.17), is appropriate. Were the physicist 
interested in whether the final state was $S^+$ or $S^-$, as could conceivably be the case 
(e.g., when trying to design some apparatus to measure such superpositions), 
then (20.17), not (20.15), would be the appropriate choice. Quantum mechanics as 
a fundamental theory allows either possibility, and does not determine the type of 
questions the physicist is allowed to ask. 

If one does not insist that MQS states be left out of the discussion, then a comparison 
of the histories in (20.12) and (20.17), which are identical up to time $t_2$ 
while the photon is still inside the interferometer, and differ only at later times, 
shows the beam splitter having an ordinary causal effect upon the photon: events at 
a later time depend upon whether the beam splitter is or is not in place, and those at 
an earlier time do not. The relationship between these two families is then similar 
to that between (20.11) and (20.15), where again the presence or absence of the 
beam splitter when the photon leaves the interferometer can be said to be the cause 
of different behavior at later times. Causality is actually a rather subtle concept, 
which philosophers have been arguing about for a long time, and it seems unlikely 
that quantum theory by itself will contribute much to this discussion. However, the 
possibility of viewing the presence or absence of the beam splitter as influencing 
later events should at the very least make one suspicious of the alternative claim 
that its location influences earlier events. 


20.4 Quantum coin toss and counterfactual paradox 

Thus far we have worked out various consistent families for two quite distinct 
situations: the beam splitter in place, or moved out of the way. One can, however, 
include both possibilities in a single framework in which a quantum coin is tossed 
while the photon is still inside the interferometer, with the outcome of the toss fed 
to a servomechanism which moves the beam splitter out of the way or leaves it in 
place at the time when the photon leaves the interferometer. This makes it possible 
to examine the counterfactual formulation of the delayed choice paradox found at 
the end of Sec. 20.1. 

The use of a quantum coin for moving a beam splitter was discussed in Sec. 19.2, and we shall use a simplified notation similar to (19.7). Let $|B_0\rangle$ be the state of the quantum coin, servomechanism, and beam splitter prior to the time $t_1$ when the photon is already inside the interferometer, and suppose that during the time interval from $t_1$ to $t_2$ the quantum coin toss occurs, leading to a unitary evolution 

$$|B_0\rangle \mapsto (|B_{\rm in}\rangle + |B_{\rm out}\rangle)/\sqrt{2}, \tag{20.18}$$

with the states $|B_{\rm in}\rangle$ and $|B_{\rm out}\rangle$ corresponding to the beam splitter in place or removed from the path of the photon. The unitary time development of the photon from $t_2$ to $t_3$, in agreement with (20.3) and (20.5), is given by the expressions 

$$\begin{aligned}
|2c\rangle|B_{\rm in}\rangle &\mapsto |3\bar c\rangle|B_{\rm in}\rangle, & |2d\rangle|B_{\rm in}\rangle &\mapsto |3\bar d\rangle|B_{\rm in}\rangle, \\
|2c\rangle|B_{\rm out}\rangle &\mapsto |3f\rangle|B_{\rm out}\rangle, & |2d\rangle|B_{\rm out}\rangle &\mapsto |3e\rangle|B_{\rm out}\rangle.
\end{aligned} \tag{20.19}$$

The unitary time development of the initial state 

$$|\Omega_0\rangle = |0a\rangle|B_0\rangle|E^\circ\rangle|F^\circ\rangle \tag{20.20}$$

can be worked out using the formulas in Sec. 20.2 combined with (20.18) and (20.19). In order to keep the notation simple, we assume that the apparatus states $|B_0\rangle$, $|B_{\rm in}\rangle$, $|B_{\rm out}\rangle$ do not change except during the time interval from $t_1$ to $t_2$, when the change is given by (20.18). The reader may find it helpful to work out $|\Omega_j\rangle = T(t_j, t_0)|\Omega_0\rangle$ at different times. At $t_4$, when the photon has been detected, it is given by 

$$|\Omega_4\rangle = (|B_{\rm in}\rangle|E^\circ\rangle|F^*\rangle + |B_{\rm out}\rangle|S^+\rangle)/\sqrt{2}. \tag{20.21}$$

Suppose the quantum coin toss results in the beam splitter being out of the way 
at the moment when the photon leaves the interferometer, and that the photon is 
detected by E. What would have occurred if the coin toss had, instead, left the 
beam splitter in place? As noted in Sec. 19.4, to address such a counterfactual 
question we need to use a particular consistent family, and specify a pivot. The 
answers to counterfactual questions are in general not unique, since one can employ 
more than one family, and more than one pivot within a single family. 

Consider the family whose support consists of the three histories 

$$\Omega_0 \odot [1\bar a] \odot \begin{cases} B_{\rm in} \odot [3f] \odot F^*, \\ B_{\rm out} \odot [3\bar c] \odot \{E^*, F^*\} \end{cases} \tag{20.22}$$

at the times $t_0 < t_1 < t_2 < t_3 < t_4$. Note that $B_{\rm out}$ and $E^*$ occur on the lower line, and we can trace this history back to $[1\bar a]$ at $t_1$ as the pivot, and then forwards again along the upper line corresponding to $B_{\rm in}$, to conclude that if the beam splitter had been in place the photon would have been detected by $F$. This is not surprising and certainly not paradoxical. (Note that having the $E$ detector detect the photon when the beam splitter is absent is quite consistent with the photon having been in a superposition state until just before the time of its detection; this corresponds to (20.11) in Sec. 20.3.) To construct a paradox we need to be able to infer from $E^*$ at $t_4$ that the photon was earlier in the $d$ arm of the interferometer. This suggests using the consistent family whose support is 

$$\Omega_0 \odot \begin{cases} [1\bar a] \odot B_{\rm in} \odot [3f] \odot F^*, \\ [1c] \odot B_{\rm out} \odot [3f] \odot F^*, \\ [1d] \odot B_{\rm out} \odot [3e] \odot E^*, \end{cases} \tag{20.23}$$

rather than (20.22). (The consistency of (20.23) follows from noting that one of the two histories which ends in $F^*$ is associated with $B_{\rm in}$ and the other with $B_{\rm out}$, and these two states are mutually orthogonal, since they are macroscopically distinct.) The events at $t_1$ are contextual in the sense of Ch. 14, with $[1\bar a]$ dependent upon $B_{\rm in}$, while $[1c]$ and $[1d]$ depend on $B_{\rm out}$. 
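
The weights of the three histories in (20.23), and the orthogonality of their chain kets, can be checked with a short calculation. This is only a sketch under simplifying assumptions: the final states are labelled by a pair (beam splitter position, detector outcome), with "E*" standing for $|E^*\rangle|F^\circ\rangle$ and "F*" for $|E^\circ\rangle|F^*\rangle$, distinct labels are treated as orthonormal, and every name in the code is illustrative.

```python
from math import sqrt

R = 1 / sqrt(2)

def detector_amplitudes(arm, splitter):
    """Amplitudes of the detector outcomes at t4 for a photon in arm 'c' or 'd' at t1."""
    if splitter == "in":                                # (20.3), (20.4), (20.6)
        sign = 1.0 if arm == "c" else -1.0
        return {"E*": sign * R, "F*": R}
    return {"F*": 1.0} if arm == "c" else {"E*": 1.0}   # (20.5), (20.6)

def chain_ket(arms, splitter):
    """Chain ket at t4 as {(splitter, outcome): amplitude}.

    arms="cd" stands for the projector [1a-bar], which leaves |1a-bar>
    unchanged; arms="c" or arms="d" stands for [1c] or [1d].
    """
    ket = {}
    for arm in arms:
        for outcome, amp in detector_amplitudes(arm, splitter).items():
            key = (splitter, outcome)
            # One factor R from (20.1) for the arm, one from (20.18) for the coin.
            ket[key] = ket.get(key, 0.0) + R * R * amp
    return {k: v for k, v in ket.items() if abs(v) > 1e-12}

def inner(u, v):
    """Inner product of two real-amplitude kets stored as dicts."""
    return sum(amp * v.get(key, 0.0) for key, amp in u.items())

def weight(ket):
    return inner(ket, ket)

# The three histories of (20.23), in the order written there.
family = {
    "[1a-bar], B_in, F*": chain_ket("cd", "in"),
    "[1c], B_out, F*": chain_ket("c", "out"),
    "[1d], B_out, E*": chain_ket("d", "out"),
}

if __name__ == "__main__":
    for name, ket in family.items():
        print(name, "weight =", round(weight(ket), 6))
```

The weights come out as 1/2, 1/4, and 1/4, and since $E^*$ occurs only in the third history, the conditional probability of $[1d]$ at $t_1$ given $B_{\rm out}$ and $E^*$ is 1 within this family.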

The family (20.23) does allow one to infer that the photon was earlier in the $d$ arm if it was later detected by $E$, since $E^*$ occurs only in the third history, preceded by $[1d]$ at $t_1$. However, since this event precedes $B_{\rm out}$ but not $B_{\rm in}$, it cannot serve as a pivot for answering a question in which the actual $B_{\rm out}$ is replaced by the counterfactual $B_{\rm in}$. The only event in (20.23) which can be used for this purpose is $\Omega_0$. Using $\Omega_0$ as a pivot, we conclude that had the beam splitter been in, the photon would surely have arrived at detector $F$, which is a sensible result. However, the null counterfactual question, “What would have happened if the beam splitter had been out of the way (as in fact it was)?”, receives a rather indefinite, probabilistic answer. Either the photon would have been in the $d$ arm and detected by $E$, or it would have been in the $c$ arm and detected by $F$. Thus using $\Omega_0$ as the pivot means, in effect, answering the counterfactual question after erasing the information that the photon was detected by $E$ rather than by $F$, or that it was in the $d$ arm rather than the $c$ arm. Hence if we use the family (20.23) with $\Omega_0$ as the pivot, the original counterfactual paradox, with its assumption that detection by $E$ implied that the photon was earlier in $d$, and then asking what would have occurred if this photon had encountered the beam splitter, seems to have disappeared, or at least it has become rather vague. 

To be sure, one might argue that there is something paradoxical in that the superposition 
state $[1\bar a]$ in (20.23) is present in the $B_{\rm in}$ history, whereas nonsuperposition 
states $[1c]$ and $[1d]$ precede $B_{\rm out}$. Could this be a sign of the future influencing the 
past? That is not very plausible, for, as noted in Ch. 14, the sort of contextuality we 
have here, with the earlier photon state depending on the later $B_{\rm in}$ and $B_{\rm out}$, reflects 
the way in which the quantum description has been constructed. If there is an influ- 
ence of the future on the past, it is rather like the influence of the end of a novel on 
its beginning, as noted in the previous section. Or, to put it in somewhat different 
terms, this influence manifests itself in the theoretical physicist’s notebook rather 
than in the experimental physicist’s laboratory. 

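What might come closer to representing the basic idea behind the delayed choice paradox is a family in which $[1d]$ at $t_1$ can serve as a pivot for a counterfactual argument, rather than having to rely on $\Omega_0$ at $t_0$. Here is such a family: 

$$\Omega_0 \odot \begin{cases}
[1c] \odot \begin{cases} B_{\rm in} \odot [3\bar c] \odot S^+, \\ B_{\rm out} \odot [3f] \odot F^*, \end{cases} \\[1ex]
[1d] \odot \begin{cases} B_{\rm in} \odot [3\bar d] \odot S^-, \\ B_{\rm out} \odot [3e] \odot E^*. \end{cases}
\end{cases} \tag{20.24}$$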


If we use $[1d]$ at $t_1$ as the pivot for a case in which the beam splitter is out and the photon is detected in $E$, it gives a precise answer to the null counterfactual question of what would have happened had the beam splitter been out (as it actually was): the photon would have been detected by $E$ and not by $F$. But now when we ask what would have happened had the beam splitter been left in place, the answer is that the system of detectors would later have been in the MQS state $S^-$. In the same way, if the photon is detected in $F$ when the beam splitter is out, a counterfactual argument using $[1c]$ at $t_1$ as the pivot leads to the conclusion that had the beam splitter been in, the detectors would later have been in the MQS state $S^+$, which is orthogonal to, and hence quite distinct from, $S^-$. Thus detection in $F$ rather than $E$ when the beam splitter is out leads to a different counterfactual conclusion, in contrast with what we found earlier when using $\Omega_0$ as the pivot. That the answers to our counterfactual questions involve MQS states is hardly surprising, given the discussion in Sec. 20.3. And, as in the case of (20.17), the MQS states in (20.24) cannot be replaced with ordinary pointer states (as defined at the end of Sec. 9.5) $E^*$ and $F^*$ of the detectors, for doing so would result in an inconsistent family. 
Also note the analogy with the situation considered in Sec. 19.4, where looking for 
a framework which could give a more precise answer to a counterfactual question 
involving a spin measurement led to a family (19.12) containing MQS states. 

Let us summarize the results obtained by using a quantum coin and studying 
various consistent families related to the counterfactual statement of the delayed 
choice paradox. We have looked at three different frameworks, (20.22), (20.23), 
and (20.24), and found that they give somewhat different answers to the question 
of what would have happened if the beam splitter had been left in place, when 
what actually happened was that the photon was detected in E with the beam split- 
ter out. (Such a multiplicity of answers is typical of quantum and — to a lesser 
degree — classical stochastic counterfactual questions; see Sec. 19.4.) In the end, 
none of the frameworks supports the original paradox, but each framework evades 
it for a somewhat different reason. Thus (20.22) does not have photon states lo- 
calized in the arms of the interferometer, (20.23) has such states, but they cannot 
be used as a pivot for the counterfactual argument, and remedying this last prob- 
lem by using (20.24) results in the counterfactual question being answered in terms 
of MQS states, which were certainly not in view in the original statement of the 
paradox. 


20.5 Conclusion 

The analysis of the delayed choice paradox given above provides some useful 
lessons on how to analyze quantum paradoxes of this general sort. Perhaps the 
first and most important lesson is that a paradox must be turned into an explicit 
quantum mechanical model, complete with a set of unitary time transformations. 
The model should be kept as simple as possible: there is no point in using long 
expressions and extensive calculations when the essential elements of the paradox 
and the ideas for untangling it can be represented in a simple way. Indeed, the 
simpler the representation, the easier it will be to spot the problematic reasoning 
underlying the paradox. In the interests of simplicity we used single states, rather 
than macroscopic projectors or density matrices, for the measuring apparatus, and 
for discussing the outcomes of a quantum coin toss. A more sophisticated approach 
is available, see Sec. 17.4, but it leads to the same conclusions. 

A second lesson is that in order to discuss a paradox, it is necessary to introduce 
an appropriate framework, which will be a consistent family if the paradox involves 
time development. There will, typically, be more than one possible framework, 
and it is a good idea to look at several, since different frameworks allow one to 
investigate different aspects of a situation. 

A third lesson has to do with MQS states. These are usually not taken into ac- 
count when stating a paradox, and this is not surprising: most physicists do not 
have any intuitive idea as to what they mean. Nevertheless, families containing 
MQS states may be very useful for understanding where the reasoning underlying 
a paradox has gone astray. For example, a process of implicitly (and thus uncon- 
sciously) choosing families which contain no MQS states, and then inferring from 
this that the future influences the past, or that there are mysterious nonlocal influ- 
ences, lies behind a number of paradoxes. This becomes evident when one works 
out various alternative families of histories and sees what is needed in order to 
satisfy the consistency conditions. 


21.1 Statement of the paradox 

The paradox of indirect measurement, often called interaction-free measurement, 
can be put in a form very similar to the delayed choice paradox discussed in 
Ch. 20. Consider a Mach-Zehnder interferometer, Fig. 21.1, with two beam split- 
ters, which are always present. A mirror M can be placed either (a) in the c arm 
of the interferometer, where it reflects the photon out of this arm into channel g, 
thus preventing it from reaching the second beam splitter, or (b) outside the c arm, 
in a place where it has no effect. The two positions of $M$ are denoted by $M_{\rm in}$ and $M_{\rm out}$. Detectors $E$, $F$, and $G$ detect the photon when it emerges from the apparatus in channels $e$, $f$, or $g$. With $M$ out of the way, the path differences inside the interferometer are such that a photon which enters through channel $a$ will always emerge in channel $f$, so the photon will always be detected by $F$. With $M$ in place, a photon which passes into the $c$ channel cannot reach the second beam splitter $B_2$. However, a photon which reaches $B_2$ by passing through the $d$ arm can emerge in either the $e$ or the $f$ channel, with equal probability. As a consequence, for $M_{\rm in}$ the probabilities for detection by $E$, $F$, and $G$ are 1/4, 1/4, and 1/2, respectively. 

Detection of a photon by G can be thought of as a measurement indicating that 
the mirror was in the position $M_{\rm in}$ rather than $M_{\rm out}$. It is a partial measurement of 
the mirror’s position in that while a photon detected by G implies the mirror is in 
place, the converse is not true: the mirror can be in place without the photon being 
detected by G, since it might have passed through the d arm of the interferometer. 
Detection of the photon by E can likewise be thought of as a measurement indicat- 
ing that M is in the c arm, since when M is not there the photon is always detected 
by F. Detection by E is an indirect measurement that M is in place, in contrast to 
the direct measurement which occurs when G detects the photon. And detection 
by E is also a partial measurement: it can occur only when M is in the c arm, 
but it does not always occur then. 

Fig. 21.1. Mach-Zehnder interferometer with extra mirror M located (a) in the c arm, (b) 
outside the interferometer. 


The indirect measurement using E seems paradoxical for the following reason. 
In order to reach E, the photon must have passed through the d arm of the interfer- 
ometer, since the c arm was blocked by M. Hence the photon was never anywhere 
near M, and could not have interacted with M. How, then, could the photon have 
been affected by the presence or absence of the mirror in the c arm, that is, by 
the difference between $M_{\rm in}$ and $M_{\rm out}$? How could it “know” that the c arm was 
blocked, and that therefore it was allowed to emerge (with a certain probability) in 
the e channel, an event not possible had M been outside the c arm? 

The paradox becomes even more striking in a delayed choice version analogous 
to that used in Ch. 20. Suppose the mirror M is initially not in the c arm. However, 
just before the time of arrival of the photon — that is, the time the photon would 
arrive were it to pass through the c arm — M is either left outside or rapidly moved 
into place inside the arm by a servomechanism actuated by a quantum coin flip 
which took place when the photon had already passed the first beam splitter $B_1$. In 
this case one can check, see the analysis in Sec. 21.4, that if the photon was later 
detected in E, M must have been in place blocking the c arm at the instant when 
the photon would have struck it had the photon been in the c arm. That is, despite 
the fact that the photon arriving in E was earlier in the d arm it seems to have been 
sensitive to the state of affairs existing far away in the c arm at just the instant when 
it would have encountered M! Is there any way to explain this apart from some 
mysterious nonlocal influence of M upon the photon? 

A yet more striking form of the delayed choice version comes from contemplating 
an extremely large interferometer located somewhere out in space, in which 
one can arrange that the entire decision process as to whether or not to place M in 
the c arm occurs during a time when the photon, later detected in E, is at a point 
on its trajectory through the d arm of the interferometer which is space-like sepa- 
rated (in the sense of relativity theory) from the relevant events in the c arm. Not 
only does one need nonlocal influences; in addition, they must travel faster than 
the speed of light! One way to avoid invoking superluminal signals is by assuming a message is carried, at the speed of light, from $M$ to the second beam splitter $B_2$ in time to inform the photon arriving in the $d$ arm that it is allowed to leave $B_2$ in the $e$ channel, rather than having to use the $f$ channel, the only possibility for $M_{\rm out}$. The problem, of course, is to find a way of getting the message from $M$ to $B_2$, given that the photon is in the $d$ arm and hence unavailable for this task. 

A counterfactual version of this paradox is readily constructed. Suppose that 
with $M$ in the location $M_{\rm in}$ blocking the $c$ arm, the photon was detected in $E$. What would have occurred if $M_{\rm out}$ had been the case rather than $M_{\rm in}$? In particular, if the position of $M$ was decided by a quantum coin toss after the photon was already inside the interferometer, what would have happened to the photon (which must have been in the $d$ arm given that it later was detected by $E$) if the quantum coin had resulted in $M$ remaining outside the $c$ arm? Would the photon have emerged in the $f$ channel to be detected by $F$? This seems the only plausible possibility. 
But then we are back to asking the same sort of question: how could the photon 
“know” that the c arm was unblocked? 


21.2 Unitary dynamics 

The unitary dynamics for the system shown in Fig. 21.1 is in many respects the 
same as for the delayed choice paradox of Ch. 20, and we use a similar notation 
for the unitary time development. Let $|0a\rangle$ be a wave packet for the photon at $t_0$ in the input channel $a$ just before it enters the interferometer, $|1c\rangle$ and $|1d\rangle$ be wave packets in the $c$ and $d$ arms of the interferometer at time $t_1$, and $|2c\rangle$ and $|2d\rangle$ their counterparts at a time $t_2$ chosen so that if $M$ is in the $c$ arm the photon will have been reflected by it into a packet $|2g\rangle$ in the $g$ channel. At time $t_3$ the photon will have emerged from the second beam splitter in channel $e$ or $f$ (the corresponding wave packets are $|3e\rangle$ and $|3f\rangle$) or will be in a wave packet $|3g\rangle$ in the $g$ channel. Finally, at $t_4$ the photon will have been detected by one of the three detectors in Fig. 21.1. Their ready states are $|E^\circ\rangle$, $|F^\circ\rangle$, and $|G^\circ\rangle$, with $|E^*\rangle$, $|F^*\rangle$, and $|G^*\rangle$ the corresponding states when a photon has been detected. 



The unitary time development from $t_0$ to $t_1$ is given by 

$$|0a\rangle \mapsto |1\bar a\rangle := (|1c\rangle + |1d\rangle)/\sqrt{2}. \tag{21.1}$$

For $t_1$ to $t_2$ it depends on the location of $M$: 

$$\begin{aligned}
M_{\rm out}: \quad & |1c\rangle \mapsto |2c\rangle, \quad |1d\rangle \mapsto |2d\rangle, \\
M_{\rm in}: \quad & |1c\rangle \mapsto |2g\rangle, \quad |1d\rangle \mapsto |2d\rangle,
\end{aligned} \tag{21.2}$$

with no difference between $M_{\rm in}$ and $M_{\rm out}$ if the photon is in the $d$ arm of the interferometer. For the time step from $t_2$ to $t_3$ the relevant unitary transformations are independent of the mirror position: 

$$\begin{aligned}
|2c\rangle &\mapsto |3\bar c\rangle := (+|3e\rangle + |3f\rangle)/\sqrt{2}, \\
|2d\rangle &\mapsto |3\bar d\rangle := (-|3e\rangle + |3f\rangle)/\sqrt{2}, \\
|2g\rangle &\mapsto |3g\rangle.
\end{aligned} \tag{21.3}$$

The detector states remain unchanged from $t_0$ to $t_3$, and the detection events between $t_3$ and $t_4$ are described by: 

$$|3e\rangle|E^\circ\rangle \mapsto |E^*\rangle, \quad |3f\rangle|F^\circ\rangle \mapsto |F^*\rangle, \quad |3g\rangle|G^\circ\rangle \mapsto |G^*\rangle. \tag{21.4}$$

If the photon is not detected, the detector remains in the ready state; thus (21.4) is an abbreviated version of 

$$|3e\rangle|E^\circ\rangle|F^\circ\rangle|G^\circ\rangle \mapsto |E^*\rangle|F^\circ\rangle|G^\circ\rangle, \tag{21.5}$$

etc. One could also use macro projectors or density matrices for the detectors, see Sec. 17.4, but this would make the analysis more complicated without altering any of the conclusions. 

21.3 Comparing $M_{\rm in}$ and $M_{\rm out}$ 

In Sec. 20.3 we considered separate consistent families depending upon whether
the second beam splitter was in or out. That approach could also be used here, but
for the sake of variety we adopt one which is slightly different: a single family with
two initial states at time $t_0$, one with the mirror in and one with the mirror out,

$$|\Psi_0\rangle|M_{out}\rangle, \quad |\Psi_0\rangle|M_{in}\rangle, \tag{21.6}$$

each with a positive (nonzero) probability, where

$$|\Psi_0\rangle = |0a\rangle|E^\circ\rangle|F^\circ\rangle|G^\circ\rangle. \tag{21.7}$$

Here the mirror is treated as an inert object, so $|M_{in}\rangle$ and $|M_{out}\rangle$ do not change
with time. They do, however, influence the time development of the photon state
as indicated in (21.2). Thus unitary time development of the first of the two states
in (21.6) leads at time $t_4$ to

$$|E^\circ\rangle|F^*\rangle|G^\circ\rangle|M_{out}\rangle, \tag{21.8}$$


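while the second results in a macroscopic quantum superposition (MQS) state

$$\tfrac{1}{2}\bigl(-|E^*\rangle|F^\circ\rangle|G^\circ\rangle + |E^\circ\rangle|F^*\rangle|G^\circ\rangle + \sqrt{2}\,|E^\circ\rangle|F^\circ\rangle|G^*\rangle\bigr)|M_{in}\rangle. \tag{21.9}$$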

Consider the consistent family with support given by the four histories

$$\begin{aligned}
\Psi_0 M_{out} &\odot [1\bar a] \odot [2\bar a] \odot [3f] \odot F^*, \\
\Psi_0 M_{in} &\odot [1\bar a] \odot [2s] \odot [3s] \odot \{E^*, F^*, G^*\},
\end{aligned} \tag{21.10}$$

at the times $t_0 < t_1 < t_2 < t_3 < t_4$, where

$$|2s\rangle := (|2d\rangle + |2g\rangle)/\sqrt{2}, \quad |3s\rangle := \tfrac{1}{2}\bigl(-|3e\rangle + |3f\rangle + \sqrt{2}\,|3g\rangle\bigr) \tag{21.11}$$

are superposition states in which the photon is not located in a definite channel.
This corresponds to unitary time development until the photon is detected, and
then pointer states (as defined at the end of Sec. 9.5) for the detectors. It shows
that $E^*$ and $G^*$ can only occur with $M_{in}$, and in this sense either of these events
constitutes a measurement indicating that the mirror was in the c arm. There is
nothing paradoxical about the histories in (21.10), because an important piece of
the paradox stated in Sec. 21.1 was the notion that the photon detected by E must at
an earlier time have been in the d arm of the interferometer. But since the projectors
C and D for the particle to be in the c or the d arm do not commute with $[1\bar a]$, the
assertion that the photon was in one or the other arm of the interferometer at time
$t_1$ makes no sense when one uses (21.10), and the same is true at $t_2$.
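
The unitary steps (21.1)–(21.3) can be checked numerically. The following is a minimal sketch, not part of the original text: it represents a photon state as a dictionary from channel labels to amplitudes, omits the detector and mirror kets, and uses illustrative helper names (`apply`, `photon_at_t3`, `detection_probabilities`) that are assumptions of this example.

```python
import math

S = 1 / math.sqrt(2)

def apply(step, state):
    """Apply a linear map given as {input label: {output label: amplitude}}."""
    out = {}
    for label, amp in state.items():
        for new_label, coeff in step[label].items():
            out[new_label] = out.get(new_label, 0.0) + amp * coeff
    return out

T01 = {"0a": {"1c": S, "1d": S}}                        # (21.1)
T12 = {"out": {"1c": {"2c": 1.0}, "1d": {"2d": 1.0}},   # (21.2), mirror out
       "in":  {"1c": {"2g": 1.0}, "1d": {"2d": 1.0}}}   # (21.2), mirror in
T23 = {"2c": {"3e": S, "3f": S},                        # (21.3)
       "2d": {"3e": -S, "3f": S},
       "2g": {"3g": 1.0}}

def photon_at_t3(mirror):
    """Photon state at t_3 for mirror = 'in' or 'out', starting from |0a>."""
    state = {"0a": 1.0}
    for step in (T01, T12[mirror], T23):
        state = apply(step, state)
    return state

def detection_probabilities(mirror):
    """Born probabilities for the channels e, f, g at t_3."""
    return {ch: abs(a) ** 2 for ch, a in photon_at_t3(mirror).items()}

print(detection_probabilities("out"))  # all weight on 3f, as in (21.8)
print(detection_probabilities("in"))   # weights of |3s> in (21.11)
```

With the mirror out, the amplitudes for channel e cancel and the photon ends in $|3f\rangle$; with the mirror in, the amplitudes are those of $|3s\rangle$ in (21.11), giving weights 1/4, 1/4, and 1/2 for e, f, and g.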

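Therefore let us consider a different consistent family with support

$$\begin{aligned}
\Psi_0 M_{out} &\odot [1\bar a] \odot [2\bar a] \odot [3f] \odot F^*, \\
\Psi_0 M_{in} &\odot \begin{cases}
[1c] \odot [2g] \odot [3g] \odot G^*, \\
[1d] \odot [2d] \odot \begin{cases} [3e] \odot E^*, \\ [3f] \odot F^*. \end{cases}
\end{cases}
\end{aligned} \tag{21.12}$$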


Using this family allows us to assert that if the photon was later detected by E, then
the mirror was in the c arm, and the photon itself was in the d arm while inside the
interferometer, and thus far away from M. The photon states at time $t_1$ in this
family are contextual in the sense discussed in Ch. 14, since $[1c]$ and $[1d]$ do not
commute with $[1\bar a]$, and the same is true for $[2d]$ at $t_2$. Thus $[1d]$ and $[2d]$ depend,
in the sense of Sec. 14.1, on $M_{in}$, and it makes no sense to talk about whether the
photon is in the c or the d arm if the mirror is out of the way, $M_{out}$. For this reason
it is not possible to use (21.12) in order to investigate what effect replacing $M_{in}$
with $M_{out}$ has on the photon while it is in arm d. Hence while (21.12) represents
some advance over (21.10) in stating the paradox, it cannot be used to infer the
existence of nonlocal effects.
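
The incompatibility invoked here can be made concrete with 2 x 2 matrices. The sketch below is an illustration only: it restricts attention to the two-dimensional subspace spanned by $|1c\rangle$ and $|1d\rangle$ at $t_1$, and the names `matmul`, `commutator`, and `A_BAR` are assumptions of this example rather than notation from the text.

```python
# Arm basis at t_1: index 0 = |1c>, index 1 = |1d>.
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """Return ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

C = [[1.0, 0.0], [0.0, 0.0]]       # projector [1c]
D = [[0.0, 0.0], [0.0, 1.0]]       # projector [1d]
A_BAR = [[0.5, 0.5], [0.5, 0.5]]   # projector onto (|1c> + |1d>)/sqrt(2)

print(commutator(C, D))       # zero matrix: [1c] and [1d] are compatible
print(commutator(C, A_BAR))   # nonzero: [1c] is incompatible with the superposition projector
```

Because the second commutator does not vanish, a single framework cannot contain both the arm projectors and the superposition projector at $t_1$ for the same initial condition, which is why they appear only in different branches of (21.12).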

As noted in Ch. 14, the fact that certain events are contextual should not be 
thought of as something arising from a physical cause; in particular, it is misleading 
to think of contextual events as “caused” by the events on which they depend, in 
the technical sense defined in Sec. 14.1. Thinking that the change from $M_{out}$ to $M_{in}$
in (21.12) somehow “collapses” the photon from a superposition into one localized
in one of the arms is quite misleading. Instead, the appearance of a superposition
in the $M_{out}$ case and not in the $M_{in}$ case reflects our decision to base a quantum
description upon (21.12) rather than, for example, the family (21.10), where the
photon is in a superposition both for $M_{out}$ and $M_{in}$.

One can also use a consistent family in which the photon is in a definite arm 
while inside the interferometer both when M is in and when it is out of the c arm, 
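so that the c and d states are not contextual:

$$\begin{aligned}
\Psi_0 M_{out} &\odot \begin{cases}
[1c] \odot [2c] \odot [3\bar c] \odot S^+, \\
[1d] \odot [2d] \odot [3\bar d] \odot S^-,
\end{cases} \\
\Psi_0 M_{in} &\odot \begin{cases}
[1c] \odot [2g] \odot [3g] \odot G^*, \\
[1d] \odot [2d] \odot \begin{cases} [3e] \odot E^*, \\ [3f] \odot F^*. \end{cases}
\end{cases}
\end{aligned} \tag{21.13}$$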


The states $S^+$ and $S^-$ are the macroscopic quantum superposition (MQS) states of
detectors E and F as defined in (20.9) and (20.10). Just as in the case of the family
(20.17) in Sec. 20.3, the MQS states in the two $M_{out}$ histories in (21.13) cannot be
eliminated by replacing them with $E^*$ and $F^*$, as that would violate the consistency
conditions. And since the projectors $S^+$ and $S^-$ do not commute with $E^*$ and $F^*$,
contextuality has not really disappeared when (21.12) is replaced by (21.13): it has
been removed from the events at $t_1$ and $t_2$, but reappears in the events at $t_3$ and $t_4$.
In particular, it would make no sense to look at the events at the final time $t_4$ in
(21.13) and conclude that a detection of the photon by $E^*$ was evidence that the
mirror M was in rather than out of the c arm. While such a conclusion would be
valid using (21.10) or (21.12), it is not supported by (21.13) since in the latter $E^*$
only makes sense in the case $M_{in}$, and is meaningless with $M_{out}$.

The preceding analysis has uncovered a very basic problem. Using $E^*$ as a way
of determining that $M_{in}$ is the case rather than $M_{out}$ is incompatible with using
$E^*$ as an indication that the photon was earlier in the d rather than the c arm.
For the former, (21.10) is perfectly adequate, as is (21.12). However, when we
try to construct a family in which $[1c]$ vs. $[1d]$ makes sense whether or not the
mirror is blocking the c arm, the result, (21.13), is unsatisfactory, both because
of the appearance of MQS states at $t_4$ and also because $E^*$ is now contextual in
a way which makes it depend on $M_{in}$, so that it is meaningless in the case $M_{out}$.
Hence the detection of the photon by E cannot be used to distinguish $M_{in}$ from
$M_{out}$ if one uses (21.13). If this were a problem in classical physics, one could
try combining results from (21.10), (21.12), and (21.13) in order to complete the 
argument leading to the paradox. But these are incompatible quantum frameworks, 
so the single-framework rule means that the results obtained using one of them 
cannot be combined with results obtained using the others. From this perspective 
the paradox stated in Sec. 21.1 arises from using rules of reasoning which work 
quite well in classical physics, but do not always function properly when imported 
into the quantum domain. 


21.4 Delayed choice version 

In order to construct a delayed choice version of the paradox, we suppose that a
quantum coin is connected to a servomechanism, and during the time interval
between $t_1$ and $t_{1.5}$, while the photon is inside the interferometer but before it reaches
the mirror M, the coin is tossed and the outcome fed to the servomechanism. The
servomechanism then places the mirror M in the c arm or leaves it outside, as
determined by the outcome of the quantum coin toss. Using the abbreviated notation
at the end of Sec. 19.2, the corresponding unitary time development from $t_1$ to $t_{1.5}$
can be written in the form

$$|M_0\rangle \mapsto (|M_{in}\rangle + |M_{out}\rangle)/\sqrt{2}, \tag{21.14}$$

the counterpart of (20.18) for the delayed choice paradox of Ch. 20. Here $|M_0\rangle$ is
the initial state of the quantum coin, servomechanism, and mirror. The kets $|M_{in}\rangle$
and $|M_{out}\rangle$ in (21.14) include the mirror and the rest of the apparatus (coin and
servomechanism), and thus they have a slightly different physical interpretation from
those in (21.6). However, since the photon dynamics which interests us depends
only on where the mirror M is located, this distinction makes no difference for
the present analysis. Combining (21.14) with (21.2) gives an overall unitary time
development of the photon and the mirror (and associated apparatus) from $t_1$ to $t_2$
in the form:

$$\begin{aligned}
|1c\rangle|M_0\rangle &\mapsto (|2c\rangle|M_{out}\rangle + |2g\rangle|M_{in}\rangle)/\sqrt{2}, \\
|1d\rangle|M_0\rangle &\mapsto |2d\rangle(|M_{out}\rangle + |M_{in}\rangle)/\sqrt{2}.
\end{aligned} \tag{21.15}$$

What is important is the location of the mirror at the time $t_{1.5}$ when the photon
interacts with it (assuming both the mirror and the photon are in the c arm of the
interferometer), and not its location in the initial state $|M_0\rangle$; the latter could be
either the $M_{in}$ or $M_{out}$ position, or someplace else.

Let the initial state of the entire system at $t_0$ be

$$|\Omega_0\rangle = |0a\rangle|M_0\rangle|E^\circ\rangle|F^\circ\rangle|G^\circ\rangle, \tag{21.16}$$

and let $|\Omega_j\rangle$ be the state which results at time $t_j$ from unitary time development.
We assume that $|M_0\rangle$, $|M_{out}\rangle$, and $|M_{in}\rangle$ do not change outside the time interval
where (21.14) and (21.15) apply. At $t_3$ one has

$$|\Omega_3\rangle = \bigl[(-|3e\rangle + |3f\rangle + \sqrt{2}\,|3g\rangle)|M_{in}\rangle + 2|3f\rangle|M_{out}\rangle\bigr] \otimes |E^\circ\rangle|F^\circ\rangle|G^\circ\rangle / 2\sqrt{2}. \tag{21.17}$$

We leave to the reader the task of working out $|\Omega_j\rangle$ at other times, a useful exercise
if one wants to check the properties of the various consistent families described
below.
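
As an aid to that exercise, here is a hedged sketch which propagates the photon and mirror parts of the state through (21.1), (21.15), and (21.3). It drops the detector factor $|E^\circ\rangle|F^\circ\rangle|G^\circ\rangle$, which does not change before $t_3$, and the labels `"M0"`, `"in"`, `"out"` and the function names are illustrative assumptions.

```python
import math

S = 1 / math.sqrt(2)

# Each step maps a (photon, mirror) basis label to {output label: amplitude}.
def step_01(photon, mirror):
    """(21.1): photon enters the interferometer; apparatus stays in M0."""
    return {("1c", mirror): S, ("1d", mirror): S}

def step_12(photon, mirror):
    """(21.15): coin toss (21.14) combined with the mirror-dependent step (21.2)."""
    if photon == "1c":
        return {("2c", "out"): S, ("2g", "in"): S}
    return {("2d", "out"): S, ("2d", "in"): S}

def step_23(photon, mirror):
    """(21.3): second beam splitter; mirror state unchanged."""
    table = {"2c": {"3e": S, "3f": S}, "2d": {"3e": -S, "3f": S}, "2g": {"3g": 1.0}}
    return {(p, mirror): a for p, a in table[photon].items()}

def evolve(state, step):
    """Apply one unitary step to a state stored as {(photon, mirror): amplitude}."""
    out = {}
    for (photon, mirror), amp in state.items():
        for key, coeff in step(photon, mirror).items():
            out[key] = out.get(key, 0.0) + amp * coeff
    return out

omega = {("0a", "M0"): 1.0}
history = [omega]                      # history[j] holds the state at t_j
for step in (step_01, step_12, step_23):
    omega = evolve(omega, step)
    history.append(omega)

omega_3 = history[3]
for key in sorted(omega_3):
    print(key, round(omega_3[key], 4))
```

Multiplying the printed amplitudes by $2\sqrt{2}$ reproduces the coefficients $-1$, $+1$, $\sqrt{2}$ (with $M_{in}$) and $2$ (for $|3f\rangle|M_{out}\rangle$) in (21.17); the $|3e\rangle|M_{out}\rangle$ amplitude cancels to zero.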

Corresponding to (21.10) there is a family, now based on the single initial state
$\Omega_0$, with support

$$\Omega_0 \odot [1\bar a] \odot \begin{cases}
M_{out} \odot [3f] \odot F^*, \\
M_{in} \odot [3s] \odot \{E^*, F^*, G^*\},
\end{cases} \tag{21.18}$$


where $|3s\rangle$ is defined in (21.11). This confirms the fact that whether or not a photon
arrives at $E^*$ depends on the position of the mirror M at the time when the photon
reaches the corresponding position in the c arm of the interferometer, not on where
M was at an earlier time, in accordance with what was stated in Sec. 21.1. Suppose
that the photon has been detected in $E^*$. From (21.18) it is evident that the quantum
coin toss resulted in $M_{in}$. What would have happened if, instead, the result had
been $M_{out}$? If we use $[1\bar a]$ at $t_1$ as a pivot, the answer is that the photon would have
been detected by F. This is reasonable, but as noted in our discussion of (21.10),
not at all paradoxical, since it is impossible to use this family to discuss whether or
not the photon was in the d arm.

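The counterpart of (21.12) is the family with support

$$\Omega_0 \odot \begin{cases}
[1\bar a] \odot M_{out} \odot [3f] \odot F^*, \\
[1c] \odot M_{in} \odot [3g] \odot G^*, \\
[1d] \odot M_{in} \odot [3\bar d] \odot \{E^*, F^*\}.
\end{cases} \tag{21.19}$$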


Just as in (21.12), the photon states at $t_1$ are contextual; $[1c]$ and $[1d]$ depend
on $M_{in}$, while $[1\bar a]$ depends on $M_{out}$. The only difference is that here the
dependence is on the later, rather than earlier, position of the mirror M. Note once
again that dependence, understood in the sense defined in Ch. 14, does not refer
to a physical cause, and there is no reason to think that the future is influencing
the past; see the discussion in Secs. 20.3 and 20.4. We can use (21.19) to
conclude that the detection of the photon by E means that the photon was earlier in
the d and not the c arm of the interferometer. However, due to the contextuality
just mentioned, $[1d]$ at $t_1$ cannot serve as a pivot in a counterfactual argument
which tells what would have happened had $M_{out}$ occurred rather than $M_{in}$. The
only pivot available in (21.19) is the initial state $\Omega_0$. But the corresponding
counterfactual assertion is too vague to serve as a satisfactory basis of a paradox, for
precisely the same reasons given in Sec. 20.4 in connection with the analogous
(20.23).

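The counterpart of (21.13) is the family with support

$$\Omega_0 \odot \begin{cases}
[1c] \odot \begin{cases} M_{out} \odot [3\bar c] \odot S^+, \\ M_{in} \odot [3g] \odot G^*, \end{cases} \\
[1d] \odot \begin{cases} M_{out} \odot [3\bar d] \odot S^-, \\ M_{in} \odot [3\bar d] \odot S^-. \end{cases}
\end{cases} \tag{21.20}$$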


Here $[1c]$ and $[1d]$ are no longer contextual. Also note that in this family there is
not the slightest evidence of any nonlocal influence by the mirror on the photon:
the later time development if the photon is in the d arm at $t_1$ is exactly the same
for $M_{in}$ and for $M_{out}$. However, (21.20) is clearly not a satisfactory formulation
of the paradox, despite the fact that $[1d]$ at $t_1$ can serve as a pivot. Among other
things, $E^*$ does not appear at $t_4$. This can be remedied in part by replacing the
fourth history in (21.20) with the two histories

$$\Omega_0 \odot [1d] \odot M_{in} \odot [3\bar d] \odot \{E^*, F^*\}. \tag{21.21}$$

The resulting family, now supported on five histories of nonzero weight, remains
consistent. But $E^*$ is a contextual event dependent on $M_{in}$, and if we use this
family, $E^*$ makes no sense in the case $M_{out}$. Thus, as noted above in connection
with (21.13), we cannot when using this family employ $E^*$ as evidence that the
mirror was in rather than outside of the c arm. In addition to the difficulty just
mentioned, (21.20) has MQS states at $t_4$. While one can modify the fourth history
by replacing it with (21.21), the same remedy will not work in the other two cases,
for it would violate the consistency conditions.

Let us summarize what we have learned from considering a situation in which 
a quantum coin toss at a time when the photon is already inside the interferometer 
determines whether or not the c arm will be blocked by the mirror M. For the 
photon to later be detected by E, it is necessary that M be in the c arm at the time 
when the photon arrives at this point, and in this respect $E^*$ does, indeed, provide
a (partial) measurement indicating $M_{in}$ is the case rather than $M_{out}$. However, the
attempt to infer from this that there is some sort of nonlocal influence between M
and the photon fails, for reasons which are quite similar to those summarized at the
end of Sec. 21.3: one needs to find a consistent family in which the photon is in
the d arm both for $M_{in}$ and for $M_{out}$. This is obviously not the case for (21.18) and
(21.19), whereas (21.20), with or without the fourth line replaced with (21.21),
is unsatisfactory because of the states which appear at $t_4$. Thus by the time one
has constructed a family in which $[1d]$ at $t_1$ can serve as a pivot, the counterfactual
analysis runs into difficulty because of what happens at later times. Just as in
Sec. 21.3, one can construct various pieces of a paradox by using different
consistent families. But the fact that these families are mutually incompatible prevents
putting the pieces together to complete the paradox.


21.5 Interaction-free measurement? 

It is sometimes claimed that the determination of whether M is blocking the c arm 
by means of a photon detected in E is an “interaction-free measurement”: The 
photon did not actually interact with the mirror, but nonetheless provided informa- 
tion about its location. The term “interact with” is not easy to define in quantum 
theory, and we will want to discuss two somewhat different reasons why one might 
suppose that such an indirect measurement involves no interaction. The first is 
based on the idea that detection by E implies that the photon was earlier in the 
d arm of the interferometer, and thus far from the mirror and unable to interact 
with it — unless, of course, one believes in the existence of some mysterious long- 
range interaction. The second comes from noting that when it is in the c arm, 
Fig. 21.1(a), the mirror M is oriented in such a way that any photon hitting it 
will later be detected by G. Obviously a photon detected by E was not detected 
by G, and thus, according to this argument, could not have interacted with M. 
The consistent families introduced earlier are useful for discussing both of these 
ideas. 

Let us begin with (21.10), or its counterpart (21.18) if a quantum coin is used. In
these families the time development of the photon state is given by unitary
transformations until it has been detected. As one would expect, the photon state is
different, at times $t_2$ and later, depending upon whether M is in or out of the c
arm. Hence if unitary time development reflects the presence or absence of some
interaction, these families clearly do not support the idea that during the process 
which eventually results in E* the photon does not interact with M. Indeed, one 
comes to precisely the opposite conclusion. 

Suppose one considers families of histories in which the photon state evolves in 
a stochastic, rather than a unitary, fashion preceding the final detection. Are the 
associated probabilities affected by the presence or absence of M in the c arm? In 
particular, can one find cases in which certain probabilities are the same for both
$M_{in}$ and $M_{out}$? Neither (21.12) nor its quantum coin counterpart (21.19) provides
examples of such invariant probabilities, but (21.20) does supply an example: if
the photon is in the d arm at $t_1$, then it will certainly be in the superposition state
$[3\bar d]$ just after leaving the interferometer, and at a slightly later time the detector
system will be in the MQS state $S^-$. (One would have the same thing in (21.13) if
the last two histories were collapsed into a single history representing unitary time
development after $t_1$, ending in $S^-$ at $t_4$.) So in this case we have grounds to say
that there was no interaction between the photon and the mirror if the photon was
in the d arm at $t_1$. However, (21.20), for reasons noted in Sec. 21.4, cannot be used
if one wants to speak of photon detection by E as representing a measurement of
$M_{in}$ as against $M_{out}$. Thus we have found a case which is “interaction-free”, but it
cannot be called a “measurement”.

Finally, let us consider the argument for noninteraction based upon the idea that, 
had it interacted with the mirror, the photon would surely have been scattered into 
channel g to be detected by G. This argument would be plausible if we could be 
sure that the photon was in or not in the c arm of the interferometer at the time when 
it (might have) interacted with M. However, if the photon was in a superposition 
state at the relevant time, as is the case in the families (21.10) and (21.18), the 
argument is no longer compelling. Indeed, one could say that the $M_{in}$ histories in
these families provide a counterexample showing that when a quantum particle is 
in a delocalized state, a local interaction can produce effects which are contrary to 
the sort of intuition one builds up by using examples in classical physics, where 
particles always have well-defined positions. 

In conclusion, there seems to be no point of view from which one can justify 
the term “interaction-free measurement”. The one that comes closest might be that 
based on the family (21.20), in which the photon can be said to be definitely in 
the c or d arm of the interferometer, and when in the d arm it is not influenced by 
whether M is or is not in the c arm. But while this family can be used to argue for 
the absence of any mysterious long-range influences of the mirror on the photon, it 
is incompatible with using detection of the photon by E as a measurement of $M_{in}$
in contrast to $M_{out}$.

It is worthwhile comparing the indirect measurement situation considered in 
this chapter with a different type of “interaction-free” measurement discussed in 
Sec. 12.2 and in Secs. 18.1 and 18.2: A particle (photon or neutron) passes through 
a beam splitter, and because it is not detected by a detector in one of the two out- 
put channels, one can infer that it left the beam splitter through the other channel. 
In this situation there actually is a consistent family, see (12.31) or the analogous 
(18.7), containing the measurement outcomes, and in which the particle is far away 
from the detector in the case in which it is not detected. Thus one might have some 
justification for referring to this as “interaction-free”. However, since such a situ- 
ation can be understood quite simply in classical terms, and because “interaction- 
free” has generally been associated with confused ideas of wave function collapse, 
see Sec. 18.2, even in this case the term is probably not very helpful. 



21.6 Conclusion 



The paradox stated in Sec. 21.1 was analyzed by assuming, in Sec. 21.3, that the
mirror positions $M_{in}$ and $M_{out}$ are specified at the initial time $t_0$, before the photon
enters the interferometer, and then in Sec. 21.4 by assuming these positions are
determined by a quantum coin toss which takes place when the photon is already 
inside the interferometer. Both analyses use several consistent families, and come 
to basically similar conclusions. In particular, while various parts of the argument 
leading to the paradoxical result — e.g., the conclusion that detection by E means 
the photon was earlier in the d arm of the interferometer — can be supported by 
choosing an appropriate framework, it is not possible to put all the pieces together 
within a single consistent family. Thus the reasoning which leads to the paradox, 
when restated in a way which makes it precise, violates the single-framework rule. 

This indicates a fourth lesson on how to analyze quantum paradoxes, which can 
be added to the three in Sec. 20.5. Very often quantum paradoxes rely on reasoning 
which violates the single-framework rule. Sometimes such a violation is already 
evident in the way in which a paradox is stated, but in other instances it is more 
subtle, and analyzing several different frameworks may be necessary in order to 
discover where the difficulty lies. 

The idea of a mysterious nonlocal influence of the position of mirror M ($M_{in}$ vs.
$M_{out}$) on the photon when the latter is far away from M in the d arm of the
interferometer is not supported by a consistent quantum analysis. In the family (21.20)
the absence of any influence is quite explicit. In the family (21.19) the fact that
the photon states inside the interferometer are contextual events indicates that the
difference between the photon states arises not from some physical influence of the
mirror position, but rather from the physicist’s choice of one form of description
rather than another. (We found a very similar sort of “influence” of $B_{in}$ and $B_{out}$
in the delayed choice paradox of Ch. 20, and the remarks made there in Secs. 20.3
and 20.4 also apply to the indirect measurement paradox.) It is, of course,
important to distinguish differences arising simply because one employs a different way
of describing a situation from those which come about due to genuine physical
influences.


22.1 Simultaneous values

There is never any difficulty in supposing that a classical mechanical system pos- 
sesses, at a particular instant of time, precise values of each of its physical vari- 
ables: momentum, kinetic energy, potential energy of interaction between particles 
3 and 5, etc. Physical variables, see Sec. 5.5, correspond to real-valued functions 
on the classical phase space, and if at some time the system is described by a point
$\gamma$ in this space, the variable A has the value $A(\gamma)$, B has the value $B(\gamma)$, etc.

In quantum theory, where physical variables correspond to observables, that is, 
Hermitian operators on the Hilbert space, the situation is very different. As
discussed in Sec. 5.5, a physical variable A has the value $a_j$ provided the quantum
system is in an eigenstate of A with eigenvalue $a_j$ or, more generally, if the system
has a property represented by a nonzero projector P such that

$$AP = a_j P. \tag{22.1}$$

It is very often the case that two quantum observables have no eigenvectors in 
common, and in this situation it is impossible to assign values to both of them 
for a single quantum system at a single instant of time. This is the case for $S_x$
and $S_z$ for a spin-half particle, and as was pointed out in Sec. 4.6, even the
assumption that “$S_z = 1/2$ AND $S_x = 1/2$” is a false (rather than a meaningless)
statement is enough to generate a paradox if one uses the usual rules of classical
logic. This is perhaps the simplest example of an incompatibility paradox
arising out of the assumption that quantum properties behave in much the same
way as classical properties, so that one can ignore the rules of quantum reasoning
summarized in Ch. 16, in particular the rule which forbids combining incompatible
properties and families. By contrast, if two Hermitian operators A and B
commute, there is at least one orthonormal basis $\{|j\rangle\}$, Sec. 3.7, in which both are
diagonal,

$$A = \sum_j a_j |j\rangle\langle j|, \quad B = \sum_j b_j |j\rangle\langle j|. \tag{22.2}$$

If the quantum system is described by this framework, there is no difficulty with
supposing that A has (to take an example) the value $a_2$ at the same time as B has
the value $b_2$.

The idea that all quantum variables should simultaneously possess values, as in 
classical mechanics, has a certain intuitive appeal, and one can ask whether there 
is not some way to extend the usual Hilbert space description of quantum mechan- 
ics, perhaps by the addition of some hidden variables, in order to allow for this 
possibility. For this to be an extension rather than a completely new theory, one 
needs to place some restrictions upon which values will be allowed, and the
following are reasonable requirements: (i) The value assigned to a particular
observable will always be one of its eigenvalues. (ii) Given a collection of commuting
observables, the values assigned to them will be eigenvalues corresponding to a
single eigenvector. For example, with reference to A and B in (22.2), assigning
$a_2$ to A and $b_2$ to B is a possibility, but assigning $a_2$ to A and $b_3$ to B (assuming
$a_2 \neq a_3$ and $b_2 \neq b_3$) is not. That condition (ii) is reasonable if one intends to
assign values to all observables can be seen by noting that the projector $|2\rangle\langle 2|$ in
(22.2) is itself an observable with eigenvalues 0 and 1. If it is assigned the value
1, then it seems plausible that A should be assigned the value $a_2$ and B the value
$b_2$.

Bell and Kochen and Specker have shown that in a Hilbert space of dimension 
3 or more, assigning values to all quantum observables in accordance with (i) and 
(ii) is not possible. In Sec. 22.3 we shall present a simple example due to Mer- 
min which shows that such a value assignment is not possible in a Hilbert space 
of dimension 4 or more. Such a counterexample is a paradox in the sense that it 
represents a situation that is surprising and counterintuitive from the perspective of 
classical physics. Section 22.2 is devoted to introducing the notion of a value func- 
tional, a concept which is useful for discussing the two-spin paradox of Sec. 22.3. 
A truth functional, Sec. 22.4, is a special case of a value functional, and is useful 
for understanding how the concept of “truth” is used in quantum descriptions. The 
three-box paradox in Sec. 22.5 employs incompatible frameworks of histories in a
manner similar to the way in which the two-spin paradox uses incompatible
frameworks of properties at one time, and Sec. 22.6 extends the results of Sec. 22.4 on
truth functionals to the case of histories.


22.2 Value functionals


A value functional $v$ assigns to all members A, B, ... of some collection $\mathcal{C}$ of
physical variables numerical values of a sort which could be appropriate for
describing a single system at a single instant of time. For example, with $\gamma$ a fixed
point in the classical phase space, the value functional $v_\gamma$ assigns to each physical
variable C the value

$$v_\gamma(C) = C(\gamma) \tag{22.3}$$

of the corresponding function at the point $\gamma$. In this case $\mathcal{C}$ could be the collection
of all physical variables, or some more restricted set. If there is some algebraic
relationship among certain physical variables, as in the formula

$$E = p^2/2m + V \tag{22.4}$$

for the total energy in terms of the momentum and potential energy of a particle in
one dimension, this relationship will also be satisfied by the values assigned by $v_\gamma$:

$$v_\gamma(E) = [v_\gamma(p)]^2/2m + v_\gamma(V). \tag{22.5}$$

To define a value functional for a quantum system, let $\{D_j\}$ be some fixed
decomposition of the identity, and let the collection $\mathcal{C}$ consist of all operators of the
form

$$C = \sum_j c_j D_j, \tag{22.6}$$

with real eigenvalues $c_j$. The value functional $v_k$ defined by

$$v_k(C) = c_k \tag{22.7}$$

assigns to each physical variable C its value on the subspace $D_k$. Note that there
are as many distinct value functionals as there are members in the decomposition
$\{D_j\}$. As in the classical case, if there is some algebraic relationship among the
observables belonging to $\mathcal{C}$, such as

$$F = 2I - A + B^2, \tag{22.8}$$

it will be reflected in the values assigned by $v_k$:

$$v_k(F) = 2 - v_k(A) + [v_k(B)]^2. \tag{22.9}$$

It is important to note that the class $\mathcal{C}$ on which a quantum value functional is
defined is a collection of commuting observables, since the decomposition of the
identity is held fixed and only the eigenvalues in (22.6) are allowed to vary.
Conversely, given a collection of commuting observables, one can find an orthonormal
basis in which they are simultaneously diagonal, Sec. 3.7, and the corresponding
decomposition of the identity can be used to define value functionals which assign
values simultaneously to all of the observables in the collection.
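
A value functional on a fixed decomposition of the identity is easy to model in code. The sketch below is an illustration under stated assumptions: each observable in the collection is stored simply as its list of eigenvalues $c_j$ on a three-member decomposition $\{D_j\}$, and the numerical eigenvalues and the name `value_functional` are hypothetical choices made for this example.

```python
# Observables in the collection are stored by their eigenvalues c_j on a fixed
# decomposition of the identity {D_j}, j = 0, 1, 2 (hypothetical numbers).
A = [1.0, -1.0, 3.0]
B = [2.0, 0.5, -1.0]
IDENTITY = [1.0, 1.0, 1.0]

# F = 2I - A + B^2 as in (22.8); since all operators are diagonal in {D_j},
# the algebra can be carried out eigenvalue by eigenvalue.
F = [2 * i - a + b ** 2 for i, a, b in zip(IDENTITY, A, B)]

def value_functional(k):
    """Return v_k, which assigns to each observable its eigenvalue on D_k, as in (22.7)."""
    return lambda observable: observable[k]

for k in range(3):
    v = value_functional(k)
    # The algebraic relation (22.8) is reflected in the assigned values, as in (22.9).
    assert v(F) == 2 - v(A) + v(B) ** 2
    print(k, v(A), v(B), v(F))
```

There are three value functionals here, one per member of the decomposition, and each one respects (22.9) automatically because every observable in the collection shares the same projectors.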

The problem posed in Sec. 22.1 of defining values for all quantum observables
can be formulated as follows: Find a universal value functional $v_u$ defined on the
collection of all observables or Hermitian operators on a quantum Hilbert space,
and satisfying the conditions:

U1. For any observable A, $v_u(A)$ is one of its eigenvalues.

U2. Given any decomposition of the identity $\{D_j\}$, with $\mathcal{C}$ the corresponding
collection of observables of the form (22.6), there is some $D_k$ from the
decomposition such that

$$v_u(C) = c_k \tag{22.10}$$

for every C in $\mathcal{C}$, where $c_k$ is the coefficient in (22.6).

Conditions U1 and U2 are the counterparts of the requirements (i) and (ii) stated
in Sec. 22.1. Note that any algebraic relationship, such as (22.8), among the
members of a collection of commuting observables will be reflected in the values
assigned to them by $v_u$, as in (22.9). The reason is that there will be an orthonormal
basis in which these observables are simultaneously diagonal, and (22.10) will hold 
for the corresponding decomposition of the identity. 


22.3 Paradox of two spins 

There are various examples which show explicitly that a universal value functional 
satisfying conditions U1 and U2 in Sec. 22.2 cannot exist. One of the simplest is
the following two-spin paradox due to Mermin. For a spin-half particle let $\sigma_x$, $\sigma_y$,
and $\sigma_z$ be the operators $2S_x$, $2S_y$, and $2S_z$, with eigenvalues $\pm 1$. The corresponding
matrices using a basis of $|z^+\rangle$ and $|z^-\rangle$ are the familiar Pauli matrices:

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{22.11}$$


The Hilbert space for two spin-half particles a and b is the tensor product

$$\mathcal{H} = \mathcal{A} \otimes \mathcal{B}, \tag{22.12}$$


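and we define the corresponding spin operators as

$$\sigma_{ax} = \sigma_x \otimes I, \quad \sigma_{by} = I \otimes \sigma_y, \tag{22.13}$$

etc.

The nine operators on $\mathcal{H}$ in the $3 \times 3$ square

$$\begin{matrix}
\sigma_{ax} & \sigma_{bx} & \sigma_{ax}\sigma_{bx} \\
\sigma_{by} & \sigma_{ay} & \sigma_{ay}\sigma_{by} \\
\sigma_{ax}\sigma_{by} & \sigma_{ay}\sigma_{bx} & \sigma_{az}\sigma_{bz}
\end{matrix} \tag{22.14}$$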

have the following properties: 

M1. Each operator is Hermitian, with two eigenvalues equal to $+1$ and two equal
to $-1$.

M2. The three operators in each row commute with each other, and likewise the
three operators in each column.

M3. The product of the three operators in each row is equal to the identity $I$.

M4. The product of the three operators in both of the first two columns is $I$,
while the product of those in the last column is $-I$.

These statements can be verified by using the well-known properties of the Pauli
matrices:

$$(\sigma_x)^2 = I, \quad \sigma_x \sigma_y = i\sigma_z, \tag{22.15}$$

etc. Note that $\sigma_{ax}$ and $\sigma_{by}$ commute with each other, as they are defined on separate
factors in the tensor product, see (22.13), whereas $\sigma_{ax}$ and $\sigma_{ay}$ do not commute with
each other. Statement M1 is obvious when one notes that the trace of each of the
nine operators in (22.14) is 0, whereas its square is equal to $I$.
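
Properties M1 through M4 can also be confirmed by direct matrix multiplication. The following sketch uses plain nested lists of complex numbers rather than a linear algebra library; the helper names (`kron`, `mul`, `scaled_identity`, `product`, `commute`) are assumptions of this example, and for M1 it checks only the trace and square, from which the eigenvalue statement follows.

```python
# Pauli matrices in the |z+>, |z-> basis.
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def kron(a, b):
    """Tensor (Kronecker) product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scaled_identity(c):
    return [[c if i == j else 0 for j in range(4)] for i in range(4)]

def product(ops):
    result = scaled_identity(1)
    for op in ops:
        result = mul(result, op)
    return result

def commute(a, b):
    return mul(a, b) == mul(b, a)

ax, ay, az = kron(SX, I2), kron(SY, I2), kron(SZ, I2)   # sigma_ax etc., as in (22.13)
bx, by, bz = kron(I2, SX), kron(I2, SY), kron(I2, SZ)

SQUARE = [[ax, bx, mul(ax, bx)],
          [by, ay, mul(ay, by)],
          [mul(ax, by), mul(ay, bx), mul(az, bz)]]       # the square (22.14)
COLUMNS = [[SQUARE[r][c] for r in range(3)] for c in range(3)]

# M1 (via trace and square), M2, M3, M4 in turn.
m1 = all(sum(op[i][i] for i in range(4)) == 0 and mul(op, op) == scaled_identity(1)
         for row in SQUARE for op in row)
m2 = all(commute(p, q) for line in SQUARE + COLUMNS for p in line for q in line)
rows = [product(row) for row in SQUARE]
cols = [product(col) for col in COLUMNS]
print(m1, m2)
print([r == scaled_identity(1) for r in rows])
print([c == scaled_identity(1) for c in cols[:2]], cols[2] == scaled_identity(-1))
```

All entries are exact (0, $\pm 1$, $\pm i$), so the equality tests involve no rounding.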

A universal value functional $v_u$ will assign one of its eigenvalues, +1 or $-1$, 
to each of the nine observables in (22.14). Since the product of the operators in 
the first row is $I$ and an assignment of values preserves algebraic relations among 
commuting observables, as in (22.9), it must be the case that 

$$v_u(\sigma_{ax})\, v_u(\sigma_{bx})\, v_u(\sigma_{ax}\sigma_{bx}) = 1. \qquad (22.16)$$

The products of the values in the other rows and in the first two columns are also 1, 
whereas for the last column the product is 

$$v_u(\sigma_{ax}\sigma_{bx})\, v_u(\sigma_{ay}\sigma_{by})\, v_u(\sigma_{az}\sigma_{bz}) = -1. \qquad (22.17)$$

The set of values 

$$\begin{matrix} -1 & -1 & +1 \\ +1 & -1 & -1 \\ -1 & -1 & +1 \end{matrix} \qquad (22.18)$$

for the nine observables in (22.14) satisfies (22.16), (22.17), and all of the other 
product conditions except that the product of the integers in the center column is 
$-1$ rather than +1. This seems like a small defect, but there is no obvious way to 
remedy it, since changing any $-1$ in this column to +1 will result in a violation 
of the product condition for the corresponding row. In fact, a value assignment 
simultaneously satisfying all six product conditions is impossible, because the three 
product conditions for the rows imply that the product of all nine numbers is +1, 
while the three product conditions for the columns imply that this same product 
must be $-1$, an obvious contradiction. To be sure, we have only looked at a rather 
special collection of observables in (22.14), but this is enough to show that there is 
no universal value functional capable of assigning values to every observable in a 
manner which satisfies conditions U1 and U2 of Sec. 22.2. 
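
The counting argument can also be confirmed by exhaustive search. The sketch below, with illustrative names such as `satisfied`, runs through all $2^9 = 512$ assignments of +1 or $-1$ to the nine entries of the square and counts how many of the six product conditions each one meets. It is only a check of the combinatorics and makes no use of the operators themselves.

```python
from itertools import product
from math import prod

# Required products: +1 for each row, +1 for the first two columns,
# -1 for the last column, as in (22.16) and (22.17).
ROW_TARGETS = (1, 1, 1)
COL_TARGETS = (1, 1, -1)

def satisfied(values):
    """Number of the six product conditions met by a 3 x 3 grid of +/-1.

    `values` lists the nine entries row by row.
    """
    rows = [values[3 * r: 3 * r + 3] for r in range(3)]
    cols = [values[c::3] for c in range(3)]
    ok_rows = sum(prod(row) == t for row, t in zip(rows, ROW_TARGETS))
    ok_cols = sum(prod(col) == t for col, t in zip(cols, COL_TARGETS))
    return ok_rows + ok_cols

counts = [satisfied(v) for v in product((1, -1), repeat=9)]
best = max(counts)            # largest number of conditions met at once
n_perfect = counts.count(6)   # assignments meeting all six conditions

example = (-1, -1, +1,
           +1, -1, -1,
           -1, -1, +1)        # the assignment shown in (22.18)

print(best, n_perfect, satisfied(example))
```

No assignment meets all six conditions, and the assignment (22.18) meets five of them, failing only in the center column.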

It should be emphasized that the two-spin paradox is not a paradox for quantum 
mechanics as such, because quantum theory provides no mechanism for assigning 
values simultaneously to noncommuting observables (except for special cases in 
which they happen to have a common eigenvector). Each of the nine observables 
in (22.14) commutes with four others: two in the same row, and two in the same 
column. However, it does not commute with the other four observables. Hence 
there is no reason to expect that a single value functional can assign sensible values 
to all nine, and indeed it cannot. The motivation for thinking that such a function 
might exist comes from the analogy provided by classical mechanics, as noted in 
Sec. 22.1. What the two-spin paradox shows is that at least in this respect there is 
a profound difference between quantum and classical physics. 

This example shows that a universal value functional is not possible in a four- 
dimensional Hilbert space, or in any Hilbert space of higher dimension, since one 
could set up the same example in a four-dimensional subspace of the larger space. 
The simplest known examples showing that universal value functionals are impos- 
sible in a three-dimensional Hilbert space are much more complicated. Universal 
value functionals are possible in a two-dimensional Hilbert space, a fact of no par- 
ticular physical significance, since very little quantum theory can be carried out if 
one is limited to such a space. 


22.4 Truth functionals 

Additional insight into the difference between classical and quantum physics 
comes from considering truth functionals. A truth functional is a value functional 
defined on a collection of indicators (in the classical case) or projectors (in the 
quantum case), rather than on a more general collection of physical variables or 
observables. A classical truth functional $\theta_\gamma$ can be defined by choosing a fixed point 
$\gamma$ in the phase space, and then for every indicator P belonging to some collection C 
writing 

$$\theta_\gamma(P) = P(\gamma), \qquad (22.19)$$

which is the same as (22.3). Since an indicator can only take the values 0 or 1, 
$\theta_\gamma(P)$ will either be 1, signifying that the system in the state $\gamma$ possesses the property 
P, and thus that P is true; or 0, indicating that P is false. (Recall that the indicator 
for a classical property, (4.1), takes the value 1 on the set of points in the phase 
space where the system has this property, and 0 elsewhere.) 

A quantum truth functional is defined on a Boolean algebra C of projectors of 
the type 

$$P = \sum_j \pi_j D_j, \qquad (22.20)$$

where each $\pi_j$ is either 0 or 1, and $\{D_j\}$ is a decomposition of the identity. It has 
the form 

$$\theta_k(P) = \begin{cases} 1 & \text{if } PD_k = D_k, \\ 0 & \text{if } PD_k = 0, \end{cases} \qquad (22.21)$$

for some choice of k. This is a special case of (22.7), with (22.20) and $\pi_k$ playing 
the role of (22.6) and $c_k$. If one thinks of the decomposition $\{D_j\}$ as a sample 
space of mutually exclusive events, one and only one of which occurs, then the 
truth functional $\theta_k$ assigns the value 1 to all properties P which are true, in the 
sense that $\Pr(P \mid D_k) = 1$, when $D_k$ is the event which actually occurs, and 0 
to all properties P which are false, in the sense that $\Pr(P \mid D_k) = 0$. Thus as 
long as one only considers a single decomposition of the identity the situation is 
analogous to the classical case: the projectors in $\{D_j\}$ constitute what is, in effect, 
a discrete phase space. The difference between classical and quantum physics lies 
in the fact that $\theta_\gamma$ in (22.19) can be applied to as large a collection of indicators 
as one pleases, whereas the definition of $\theta_k$ in (22.21) will not work for an arbitrary 
collection of projectors; in particular, if P does not commute with $D_k$, $PD_k$ is not 
a projector. 

For a given decomposition $\{D_j\}$ of the identity, the truth functional $\theta_k$ is simply 
the value functional $v_k$ of (22.7) restricted to projectors belonging to C rather than 
to more general operators; that is, 

$$\theta_k(P) = v_k(P) \qquad (22.22)$$

for all P in C. Conversely, $v_k$ is determined by $\theta_k$ in the sense that for any operator 
C of the form (22.6) one has 

$$v_k(C) = \sum_j c_j \theta_k(D_j). \qquad (22.23)$$
An alternative approach to defining a truth functional is the following. Let $\theta(P)$ 
assign the value 0 or 1 to every projector in the Boolean algebra C generated by 
the decomposition of the identity $\{D_j\}$, subject to the following conditions: 

$$\begin{aligned} \theta(I) &= 1, \\ \theta(I - P) &= 1 - \theta(P), \\ \theta(PQ) &= \theta(P)\,\theta(Q). \end{aligned} \qquad (22.24)$$

One can think of these as a special case of a value functional preserving algebraic 
relations, as discussed in Sec. 22.2. Thus it is evident that $\theta_k$ as defined in (22.21), 
since it is derived from a value functional, (22.22), will satisfy (22.24). It can also 
be shown that a functional $\theta$ taking the values 0 and 1 and satisfying (22.24) must 
be of the form (22.21) for some k. 
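
The last claim can be checked by brute force in a small case. The sketch below assumes a decomposition of the identity into three projectors (the value `N = 3` is an arbitrary choice) and encodes each projector of the form (22.20) by its tuple of coefficients $(\pi_1, \pi_2, \pi_3)$. The encoding and all function names are illustrative assumptions, and the index k runs from 0 in the code.

```python
from itertools import product

N = 3  # number of projectors D_j in the decomposition (arbitrary small choice)

# A projector P = sum_j pi_j D_j is encoded by the tuple (pi_1, ..., pi_N).
ALGEBRA = list(product((0, 1), repeat=N))
IDENTITY = (1,) * N

def complement(p):
    """Tuple encoding I - P."""
    return tuple(1 - x for x in p)

def meet(p, q):
    """Tuple encoding PQ (all projectors in the algebra commute)."""
    return tuple(x * y for x, y in zip(p, q))

def theta_k(k):
    """Truth functional (22.21): 1 if P D_k = D_k, 0 if P D_k = 0."""
    return {p: p[k] for p in ALGEBRA}

def satisfies_22_24(theta):
    """Check the three conditions of (22.24) on the whole algebra."""
    return (theta[IDENTITY] == 1
            and all(theta[complement(p)] == 1 - theta[p] for p in ALGEBRA)
            and all(theta[meet(p, q)] == theta[p] * theta[q]
                    for p in ALGEBRA for q in ALGEBRA))

# Try every assignment of 0 or 1 to the 2^N projectors in the algebra.
solutions = []
for bits in product((0, 1), repeat=len(ALGEBRA)):
    theta = dict(zip(ALGEBRA, bits))
    if satisfies_22_24(theta):
        solutions.append(theta)

expected = [theta_k(k) for k in range(N)]
print(len(solutions), all(s in expected for s in solutions))
```

In this small example the only assignments that survive are the N functionals $\theta_k$, one for each projector in the decomposition.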

We shall define a universal truth functional to be a functional $\theta_u$ which assigns 
0 or 1 to every projector P on the Hilbert space, not simply those associated with a 
particular Boolean algebra C, in such a way that the relations in (22.24) are satisfied 
whenever they make sense. In particular, the third relation in (22.24) makes no 
sense if P and Q do not commute, for then PQ is not a projector, so we modify it 
to read: 

$$\theta_u(PQ) = \theta_u(P)\,\theta_u(Q) \quad \text{if } PQ = QP. \qquad (22.25)$$

When P and Q both belong to the same Boolean algebra they commute with each 
other, so $\theta_u$ when restricted to a particular Boolean algebra C satisfies (22.24). 
Consequently, when $\theta_u$ is thought of as a function on the projectors in C, it 
coincides with an “ordinary” truth functional $\theta_k$ for this algebra, for some choice of 
k. 

Given a universal value functional $v_u$, we can define a corresponding universal 
truth functional $\theta_u$ by letting $\theta_u(P) = v_u(P)$ for every projector P. Conversely, 
given a universal truth functional one can use it to construct a universal value 
functional satisfying conditions U1 and U2 of Sec. 22.2 by using the counterpart of 
(22.23): 

$$v_u(C) = \sum_j c_j \theta_u(D_j). \qquad (22.26)$$

That is, given any Hermitian operator C there is a decomposition of the identity 
$\{D_j\}$ such that C can be written in the form (22.6). On this decomposition of the 
identity $\theta_u$ must agree with $\theta_k$ for some k, so the right side of (22.23) makes sense, 
and can be used to define $v_u(C)$. It then follows that U1 and U2 of Sec. 22.2 
are satisfied. If the eigenvalues of C are degenerate there is more than one way of 
writing it in the form (22.6), but it can be shown that the properties we are assuming 
for $\theta_u$ imply that these different possibilities lead to the same $v_u(C)$. 




This close connection between universal value functionals and universal truth 
functionals means that arguments for the existence or nonexistence of one imme- 
diately apply to the other. Thus neither of these universal functionals can be con- 
structed in a Hilbert space of dimension 3 or more, and the two-spin paradox of 
Sec. 22.3, while formulated in terms of a universal value functional, also demon- 
strates the nonexistence of a universal truth functional in a four (or higher) dimen- 
sional Hilbert space. It is, indeed, somewhat disappointing that there is nothing 
very significant to which the formulas (22.25) and (22.26) actually apply! 

The nonexistence of universal quantum truth functionals is not very surprising. 
It is simply another manifestation of the fact that quantum incompatibility makes 
it impossible to extend certain ideas associated with the classical notion of truth 
into the quantum domain. Similar problems were discussed earlier in Sec. 4.6 
in connection with incompatible properties, and in Sec. 16.4 in connection with 
incompatible frameworks. 


22.5 Paradox of three boxes 

The three-box paradox of Aharonov and Vaidman resembles the two-spin paradox 
of Sec. 22.3 in that it is a relatively simple example which is incompatible with the 
existence of a universal truth functional. Whereas the two-spin paradox refers to 
properties of a quantum system at a single instant of time, the three-box paradox 
employs histories, and the incompatibility of the different frameworks reflects a 
violation of consistency conditions rather than the fact that projectors do not com- 
mute with each other. The paradox is discussed in this section, and the connection 
with truth functionals for histories is worked out in Sec. 22.6. 

Consider a three-dimensional Hilbert space spanned by an orthonormal basis 
consisting of three states $|A\rangle$, $|B\rangle$, and $|C\rangle$. As in the original statement of the 
paradox, we shall think of these states as corresponding to a particle being in one 
of three separate boxes, though one could equally well suppose that they are three 
orthogonal states of a spin-one particle, or the states $m = -1$, 0, and 1 in a toy 
model of the type introduced in Sec. 2.5. The dynamics is trivial, $T(t', t) = I$: if 
the particle is in one of the boxes, it stays there. We shall be interested in quantum 
histories involving three times $t_0 < t_1 < t_2$, based upon an initial state 

$$|D\rangle = (|A\rangle + |B\rangle + |C\rangle)/\sqrt{3} \qquad (22.27)$$

at $t_0$, and ending at $t_2$ in one of the two events F or $\tilde{F} = I - F$, where F is the 
projector corresponding to 

$$|F\rangle = (|A\rangle + |B\rangle - |C\rangle)/\sqrt{3}. \qquad (22.28)$$

In the first consistent family $\mathcal{A}$ the events at the intermediate time $t_1$ are A and 
$\tilde{A} = I - A$, with A the projector $|A\rangle\langle A|$. The support of this family consists of the 
three histories 

$$D \odot \begin{cases} A \odot F, \\ A \odot \tilde{F}, \\ \tilde{A} \odot \tilde{F}, \end{cases} \qquad (22.29)$$

since $D \odot \tilde{A} \odot F$ has zero weight. Checking consistency is straightforward. The 
chain operator for the first history in (22.29) is obviously orthogonal to the other 
two because of the final states. The orthogonality of the chain operators for the 
second and third histories can be worked out using chain kets, or by replacing 
the final $\tilde{F}$ with F and employing the trick discussed in connection with (11.5). 
Because $D \odot \tilde{A} \odot F$ has zero weight, if the event F occurs at $t_2$, then A rather than 
$\tilde{A}$ must have been the case at $t_1$; that is, $\tilde{A}$ at $t_1$ is never followed by F at $t_2$. Thus 
one has 

$$\Pr(A_1 \mid D_0 \wedge F_2) = 1, \qquad (22.30)$$

with our usual convention of a subscript indicating the time of an event. 

Now consider a second consistent family $\mathcal{B}$ with events B and $\tilde{B} = I - B$ at $t_1$; 
B is the projector $|B\rangle\langle B|$. In this case the support consists of the histories 

$$D \odot \begin{cases} B \odot F, \\ B \odot \tilde{F}, \\ \tilde{B} \odot \tilde{F}, \end{cases} \qquad (22.31)$$

from which one can deduce that 

$$\Pr(B_1 \mid D_0 \wedge F_2) = 1, \qquad (22.32)$$

the obvious counterpart of (22.30) given the symmetry between $|A\rangle$ and $|B\rangle$ in the 
definition of $|D\rangle$ and $|F\rangle$. 

The paradox arises from noting that from the same initial data D and F (“initial” 
refers to position in a logical argument, not temporal order in a history; see 
Sec. 16.1) one is able to infer by using $\mathcal{A}$ that A occurred at time $t_1$, and by using 
$\mathcal{B}$ that B was the case at $t_1$. However, A and B are mutually exclusive properties, 
since $BA = 0$. That is, we seem to be able to conclude with probability 1 that the 
particle was in box A, and also that it was in box B, despite the fact that the rules 
of quantum theory indicate that it cannot simultaneously be in both boxes! Thus it 
looks as if the rules of quantum reasoning have given rise to a contradiction. 

However, these rules, as summarized in Ch. 16, require that both the initial data 
and the conclusions be embedded in a single framework, whereas we have 
employed two different consistent families, $\mathcal{A}$ and $\mathcal{B}$. In addition, in order to reach a 
contradiction we used the assertion that A and B are mutually exclusive, and this 
requires a third framework $\mathcal{C}$, since $\mathcal{A}$ does not include B and $\mathcal{B}$ does not include A 
at $t_1$. If the frameworks $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$ were compatible with each other, as is always 
the case in classical physics, there would be no problem, for the inferences carried 
out in the separate frameworks would be equally valid in the common refinement. 
But, as we shall show, these frameworks are mutually incompatible, despite the 
fact that the history projectors commute with one another. 

Any common refinement of $\mathcal{A}$ and $\mathcal{B}$ would have to contain, among other things, 
the first history in (22.29) and the first one in (22.31): 

$$D \odot A \odot F, \quad D \odot B \odot F. \qquad (22.33)$$

The product of these two history projectors is zero, since $AB = 0$, but the chain 
operators are not orthogonal to each other. If one works out the chain kets one 
finds that they are both equal to a nonzero constant times $|F\rangle$. Thus having the two 
histories in (22.33) in the same family will violate the consistency conditions. A 
convenient choice for $\mathcal{C}$ is the family whose support is the three histories 

$$D \odot \begin{cases} A \odot I, \\ B \odot I, \\ C \odot I. \end{cases} \qquad (22.34)$$

Note that this is, in effect, a family of histories defined at only two times, $t_0$ and $t_1$, 
as I provides no information about what is going on at $t_2$, and for this reason it is 
automatically consistent, Sec. 11.3. It is incompatible with $\mathcal{A}$ because a common 
refinement would have to include the two histories 

$$D \odot A \odot F, \quad D \odot B \odot I, \qquad (22.35)$$

whose projectors are orthogonal, but whose chain kets are not, and it is likewise 
incompatible with $\mathcal{B}$. 
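
The weights and chain kets quoted above are easy to evaluate numerically. The following Python sketch represents the three box states as unit vectors in a real three-dimensional space, which suffices here because all amplitudes are real, and uses the trivial dynamics $T = I$. The function names (`chain_ket`, `weight`, and the small linear-algebra helpers) are illustrative and not notation from the text.

```python
from math import sqrt, isclose

def outer(u, v):
    """Matrix |u><v| for real vectors."""
    return [[a * b for b in v] for a in u]

def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def sub(m, n):
    return [[m[i][j] - n[i][j] for j in range(3)] for i in range(3)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

I3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
ket_A, ket_B, ket_C = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
ket_D = [1 / sqrt(3)] * 3                            # initial state (22.27)
ket_F = [1 / sqrt(3), 1 / sqrt(3), -1 / sqrt(3)]     # final state (22.28)

A, B, C, F = (outer(k, k) for k in (ket_A, ket_B, ket_C, ket_F))
A_t, B_t, F_t = sub(I3, A), sub(I3, B), sub(I3, F)   # negations I - A, etc.

def chain_ket(mid, final):
    """Chain ket of the history D (.) mid (.) final when T = I."""
    return apply(final, apply(mid, ket_D))

def weight(mid, final):
    k = chain_ket(mid, final)
    return dot(k, k)

# Zero-weight history excluded from the support (22.29).
w_zero = weight(A_t, F)
# Consistency inside family A: second and third histories of (22.29).
overlap_in_A = dot(chain_ket(A, F_t), chain_ket(A_t, F_t))
# Conditional probabilities (22.30) and (22.32).
pr_A = weight(A, F) / (weight(A, F) + weight(A_t, F))
pr_B = weight(B, F) / (weight(B, F) + weight(B_t, F))
# Chain-ket overlaps for the pairs of histories in (22.33) and (22.35).
overlap_AB = dot(chain_ket(A, F), chain_ket(B, F))
overlap_AC = dot(chain_ket(A, F), chain_ket(B, I3))
# Weights of the three histories in (22.34), and the probability of ending in F.
family_C = [weight(P, I3) for P in (A, B, C)]
pr_F = weight(I3, F)

print(w_zero, overlap_in_A, pr_A, pr_B, overlap_AB, overlap_AC, family_C, pr_F)
```

Up to rounding, the two overlaps for (22.33) and (22.35) both come out equal to 1/9 rather than 0, which is the violation of the consistency conditions described in the text, while the overlap inside the family $\mathcal{A}$ vanishes.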

Thus the paradox arises because of reasoning in a way which violates the single- 
framework rule, and in this respect it resembles the two-spin paradox of Sec. 22.3. 
An important difference is that the incompatibility between frameworks in the case 
of two spins results from the fact that some of the nine operators in (22.14) do not 
commute with each other, whereas in the three-box paradox the projectors for the 
histories commute with each other, and incompatibility arises because the consis- 
tency conditions are not fulfilled in a common refinement. 

Rewording the paradox in a slightly different way may assist in understanding 
why some types of inference which seem quite straightforward in terms of ordinary 
reasoning are not valid in quantum theory. Let us suppose that we have used the 
family $\mathcal{A}$ together with the initial data of D at $t_0$ and F at $t_2$ to reach the conclusion 
that at time $t_1$ A is true and $\tilde{A} = I - A$ is false. Since $\tilde{A} = B + C$, it seems natural 
to conclude that both B and C are false, contradicting the result (from framework 
$\mathcal{B}$) that B is true. The step from the falsity of $\tilde{A}$ to the falsity of B as a consequence 
of $\tilde{A} = B + C$ would be justified in classical mechanics by the following rule: If 
$P = Q + R$ is an indicator which is the sum of two other indicators, and P is 
false, meaning $P(\gamma) = 0$ for the phase point $\gamma$ describing the physical system, 
then $Q(\gamma) = 0$ and $R(\gamma) = 0$, so both Q and R are false. For example, if the 
energy of a system is not in the range between 10 and 20 J, then it is not between 
10 and 15 J, nor is it between 15 and 20 J. 

The corresponding rule in quantum physics states that if the projector P is the 
sum of two projectors Q and R, and P is known to be false, then if Q and R 
are part of the Boolean algebra of properties entering into the logical discussion, 
both Q and R are false. The words in italics apply equally to the case of clas- 
sical reasoning, but they are usually ignored, because if Q is not among the list 
of properties available for discussion, it can always be added, and R = P — Q 
added at the same time to ensure that the properties form a Boolean algebra. In 
classical physics there is never any problem with adding a property which has not 
previously come up in the discussion, and therefore the rule in italics can safely be 
relegated to the dusty books on formal logic which scientists put off reading until 
after they retire. However, in quantum theory it is by no means the case that Q 
(and therefore $R = P - Q$) can always be added to the list of properties or events 
under discussion, and this is why the words in italics are extremely important. If 
by using the family $\mathcal{A}$ we have come to the conclusion that $\tilde{A} = B + C$ is false, 
and, as is in fact the case, B at $t_1$ cannot be added to this family while maintaining 
consistency, then B has to be regarded as meaningless from the point of view of 
the discussion based upon $\mathcal{A}$, and something which is meaningless cannot be either 
true or false. 

One can also think about it as follows. The physicist who first uses the initial 
data and framework $\mathcal{A}$ to conclude that A was true at $t_1$, and then inserts B at $t_1$ into 
the discussion has, in effect, changed the framework to something other than $\mathcal{A}$. In 
classical physics such a change in framework causes no problems, and it certainly 
does not alter the correctness of a conclusion reached earlier in a framework which 
made no mention of B. But in the quantum case, adding B means that something 
else must be changed in order to ensure that one still has a consistent framework. 
Since A occurs at the end of the previous step of the argument and is thus still at 
the center of attention, the physicist who introduces B is unconsciously (which is 
what makes the move so dangerous!) shifting to a framework, such as (22.34), 
in which either D at $t_0$ or F at $t_2$ has been forgotten. But as the new framework 
does not include the initial data, it is no longer possible to derive the truth of A. 
Hence adding B to the discussion in this manner is, relative to the truth of A, rather 
like sawing off the branch on which one is seated, and the whole argument comes 
crashing to the ground. 
22.6 Truth functionals for histories 


The notion of a truth functional can be applied to histories as well as to properties 
of a quantum system at a single time, and makes perfectly good sense as long 
as one considers a single framework or consistent family, based upon a sample 
space consisting of some decomposition of the history identity into elementary 
histories, as discussed in Sec. 8.5. Given this framework, one and only one of its 
elementary histories will actually occur, or be true, for a single quantum system 
during a particular time interval or run. A truth functional is then a function which 
assigns 1 (true) to a particular elementary history, 0 (false) to the other elementary 
histories, and 1 or 0 to other members of the Boolean algebra of histories using 
a formula which is the obvious analog of (22.21). The number of distinct truth 
functionals will typically be less than the number of elementary histories, since 
one need not count histories with zero weight — they are dynamically impossible, 
so they never occur — and certain elementary histories will be excluded by the 
initial data, such as an initial state. 

A universal truth functional $\theta_u$ for histories can be defined in a manner analogous 
to a universal truth functional for properties, Sec. 22.4. We assume that $\theta_u$ assigns 
a value, 1 or 0, to every projector representing a history which is not intrinsically 
inconsistent (Sec. 11.8), that is, any history which is a member of at least one 
consistent family, and that this assignment satisfies the first two conditions of (22.24) 
and the third condition whenever it makes sense. That is, (22.25) should hold when 
P and Q are two histories belonging to the same consistent family (which implies, 
among other things, that $PQ = QP$). For the purposes of the following discussion 
it will be convenient to denote by $\mathcal{T}$ the collection of all true histories, the histories 
to which $\theta_u$ assigns the value 1. Given that $\theta_u$ satisfies these conditions, it is not 
hard to see that when it is restricted to a particular consistent family or framework 
$\mathcal{F}$, that is, regarded as a function on the histories belonging to this family, it will 
coincide with one of the “ordinary” truth functionals for this family, and therefore 
$\mathcal{T} \cap \mathcal{F}$, the subset of all true histories belonging to $\mathcal{F}$, will consist of one 
elementary history and all compound histories which contain this particular elementary 
history. In particular, $\theta_u$ can never assign the value 1 to two distinct elementary 
histories belonging to the same framework. 

Since a decomposition of the identity at a single time is an example, albeit a 
rather trivial one, of a consistent family of one-time “histories”, it follows that there 
can be no truly universal truth functional for histories of a quantum system whose 
Hilbert space is of dimension 3 or more. Nonetheless, it is interesting to see how the 
three-box paradox of the previous section provides an explicit example, with 
nontrivial histories, of a circumstance in which there is no universal truth functional. 
Imagine it as an experiment which is repeated many times, always starting with the 
same initial state D. The universal truth functional and the corresponding list $\mathcal{T}$ of 
true histories will vary from one run to the next, since different histories will occur 
in different runs. Think of a run (it will occur with a probability of 1/9) in which 
the final state is F, so the history $D \odot I \odot F$ is true, and therefore an element of $\mathcal{T}$. 
What other histories belong to $\mathcal{T}$, and are thus assigned the value 1 by the universal 
truth functional $\theta_u$? 

Consider the consistent family $\mathcal{A}$ whose sample space is shown in (22.29), aside 
from histories of zero weight to which $\theta_u$ will always assign the value 0. One and 
only one of these histories must be true, so it is the history $D \odot A \odot F$, as the other 
two terminate in $\tilde{F}$. From this we can conclude, using the counterpart of (22.21), 
that $D \odot A \odot I$, which belongs to the Boolean algebra of $\mathcal{A}$, is true, a member of $\mathcal{T}$. 
Following the same line of reasoning for the consistent family $\mathcal{B}$, we conclude that 
$D \odot B \odot F$ and $D \odot B \odot I$ are elements of $\mathcal{T}$. But now consider the consistent family 
$\mathcal{C}$ with sample space (22.34). One and only one of these three elementary histories 
can belong to $\mathcal{T}$, and this contradicts the conclusion we reached previously using 
$\mathcal{A}$ and $\mathcal{B}$, that both $D \odot A \odot I$ and $D \odot B \odot I$ belong to $\mathcal{T}$. 

Our analysis does not by itself rule out the possibility of a universal truth 
functional which assigns the value 0 to the history $D \odot I \odot F$, and could be used in 
a run in which $D \odot I \odot \tilde{F}$ occurs. But it shows that the concept can, at best, be 
of rather limited utility in the quantum domain, despite the fact that it works 
without any difficulty in classical physics. Note that quantum truth functionals form a 
perfectly valid procedure for analyzing histories (and properties at a single time) 
as long as one restricts one’s attention to a single framework, a single consistent 
family. With this restriction, quantum truth as it is embodied in a truth functional 
behaves in much the same way as classical truth. It is only when one tries to ex- 
tend this concept of truth to something which applies simultaneously to different 
incompatible frameworks that problems arise. 
23.1 Introduction 

This and the following chapter can be thought of as a single unit devoted to dis- 
cussing various issues raised by a famous paper published by Einstein, Podolsky, 
and Rosen in 1935, in which they claimed to show that quantum mechanics, as it 
was understood at that time, was an incomplete theory. In particular, they asserted 
that a quantum wave function cannot provide a complete description of a quantum 
system. What they were concerned with was the problem of assigning simultane- 
ous values to noncommuting operators, a topic which has already been discussed 
to some extent in Ch. 22. Their strategy was to consider an entangled state (see the 
definition in Sec. 6.2) of two spatially separated systems, and they argued that by 
carrying out a measurement on one system it was possible to determine a property 
of the other. 

A simple example of an entangled state of spatially separated systems involves 
the spin degrees of freedom of two spin-half particles that are in different regions 
of space. In 1951 Bohm pointed out that the claim of the Einstein, Podolsky, and 
Rosen paper, commonly referred to as EPR, could be formulated in a simple way 
in terms of a singlet state of two spins, as defined in (23.2) below. Much of the 
subsequent discussion of the EPR problem has followed Bohm’s lead, and that is 
the approach adopted in this and the following chapter. In this chapter we shall 
discuss various histories for two spin-half particles initially in a singlet state, and 
pay particular attention to the statistical correlations between the two spins. The 
basic correlation function which enters many discussions of the EPR problem is 
evaluated in Sec. 23.2 using histories involving just two times. A number of fami- 
lies of histories involving three times are considered in Sec. 23.3, while Sec. 23.4 
discusses what happens when a spin measurement is carried out on one particle, 
and Sec. 23.5 the case of measurements of both particles. 
The results found in this chapter may seem a bit dull and repetitious, and the 
reader who finds them so should skip ahead to the next chapter where the EPR 
problem itself, in Bohm’s formulation, is stated in Sec. 24.1 in the form of a para- 
dox, and the paradox is explored using various results derived in the present chap- 
ter. An alternative way of looking at the paradox using counterfactuals is discussed 
in Sec. 24.2. The remainder of Ch. 24 deals with an alternative approach to the EPR 
problem in which one adds an additional mathematical structure, usually referred 
to as “hidden variables”, to the standard quantum Hilbert space of wave functions. 
A simple example of hidden variables in the context of measurements on parti- 
cles in a spin singlet state, due to Mermin, is the topic of Sec. 24.3. It disagrees 
with the predictions of quantum theory for the spin correlation function, and this 
disagreement is not a coincidence, for Bell has shown by means of an inequality 
that any hidden variables theory of this sort must disagree with the predictions of 
quantum theory. The derivation of this inequality is taken up in Sec. 24.4, which 
also contains some remarks on its significance for the (non)existence of mysterious 
nonlocal influences in the quantum world. 


23.2 Spin correlations 

Imagine two spin-half particles a and b traveling away from each other in a region 
of zero magnetic field (so the spin direction of each particle will remain fixed), 
which are described by a wave function 

$$|\chi_t\rangle = |\psi_0\rangle \otimes |\omega_t\rangle, \qquad (23.1)$$

where $|\omega_t\rangle$ is a wave packet $\omega(\mathbf{r}_a, \mathbf{r}_b, t)$ describing the positions of the two 
particles, while 

$$|\psi_0\rangle = \left( |z_a^+\rangle |z_b^-\rangle - |z_a^-\rangle |z_b^+\rangle \right)/\sqrt{2} \qquad (23.2)$$

is the singlet state of the spins of the two particles, the state with total angular 
momentum equal to 0. Hereafter we shall ignore $|\omega_t\rangle$, as it plays no essential role 
in the following arguments, and concentrate on the spin state $|\psi_0\rangle$. 

Rather than using eigenstates of $S_{az}$ and $S_{bz}$, $|\psi_0\rangle$ can be written equally well in 
terms of eigenstates of $S_{aw}$ and $S_{bw}$, where w is some direction in space described 
by the polar angles $\vartheta$ and $\varphi$. The states $|w^+\rangle$ and $|w^-\rangle$ are given as linear 
combinations of $|z^+\rangle$ and $|z^-\rangle$ in (4.14), and using these expressions one can rewrite $|\psi_0\rangle$ 
in the form 

$$|\psi_0\rangle = \left( |w_a^+\rangle |w_b^-\rangle - |w_a^-\rangle |w_b^+\rangle \right)/\sqrt{2}, \qquad (23.3)$$
or as 

$$\begin{aligned} |\psi_0\rangle = {} & \sin(\vartheta/2) \left( e^{-i\varphi/2} |z_a^+\rangle |w_b^+\rangle + e^{i\varphi/2} |z_a^-\rangle |w_b^-\rangle \right)/\sqrt{2} \\ & + \cos(\vartheta/2) \left( e^{-i\varphi/2} |z_a^+\rangle |w_b^-\rangle - e^{i\varphi/2} |z_a^-\rangle |w_b^+\rangle \right)/\sqrt{2}, \end{aligned} \qquad (23.4)$$

where $\vartheta$ and $\varphi$ are the polar angles for the direction w, with w the positive z axis 
when $\vartheta = 0$. The fact that $|\psi_0\rangle$ has the same functional form in (23.3) as in (23.2) 
reflects the fact that this state is spherically symmetrical, and thus does not single 
out any particular direction in space. 

Consider the consistent family whose support is a set of four histories at the two 
times $t_0 < t_1$: 

$$\psi_0 \odot \{z_a^+, z_a^-\}\{w_b^+, w_b^-\}, \qquad (23.5)$$

where the product of the two curly brackets stands for the set of four projectors 
$z_a^+ w_b^+$, $z_a^+ w_b^-$, $z_a^- w_b^+$, and $z_a^- w_b^-$. The time development operator $T(t_1, t_0)$ is equal 
to I, since we are only considering the spins and not the spatial wave function 
$\omega(\mathbf{r}_a, \mathbf{r}_b, t)$. Thus one can calculate the probabilities of these histories, or of the 
events at $t_1$ given $\psi_0$ at $t_0$, by thinking of $|\psi_0\rangle$ in (23.4) as a pre-probability and 
using the absolute squares of the corresponding coefficients. The result is: 

$$\begin{aligned} \Pr(z_a^+, w_b^+) = \Pr(z_a^-, w_b^-) &= \tfrac{1}{2}\sin^2(\vartheta/2) = (1 - \cos\vartheta)/4, \\ \Pr(z_a^+, w_b^-) = \Pr(z_a^-, w_b^+) &= \tfrac{1}{2}\cos^2(\vartheta/2) = (1 + \cos\vartheta)/4, \end{aligned} \qquad (23.6)$$

where one could also write $\Pr(z_a^+ \wedge w_b^+)$ in place of $\Pr(z_a^+, w_b^+)$ for the probability 
of $S_{az} = +1/2$ and $S_{bw} = +1/2$. Using these probabilities one can evaluate the 
correlation function 

$$\begin{aligned} C(z, w) &= \langle (2S_{az})(2S_{bw}) \rangle = 4\langle\psi_0| S_{az} S_{bw} |\psi_0\rangle \\ &= \Pr(z_a^+, w_b^+) + \Pr(z_a^-, w_b^-) - \Pr(z_a^+, w_b^-) - \Pr(z_a^-, w_b^+) = -\cos\vartheta. \end{aligned} \qquad (23.7)$$

Because $|\psi_0\rangle$ is spherically symmetrical, one can immediately generalize these 
results to the case of a family of histories in which the directions z and w in (23.5) 
are replaced by arbitrary directions $w_a$ and $w_b$, which can conveniently be written 
in the form of unit vectors $\mathbf{a}$ and $\mathbf{b}$. Since the cosine of the angle between $\mathbf{a}$ and $\mathbf{b}$ 
is equal to the dot product $\mathbf{a} \cdot \mathbf{b}$, the generalization of (23.6) is 

$$\begin{aligned} \Pr(a^+, b^+) = \Pr(a^-, b^-) &= (1 - \mathbf{a}\cdot\mathbf{b})/4, \\ \Pr(a^+, b^-) = \Pr(a^-, b^+) &= (1 + \mathbf{a}\cdot\mathbf{b})/4, \end{aligned} \qquad (23.8)$$

while the correlation function (23.7) is given by 

$$C(\mathbf{a}, \mathbf{b}) = -\mathbf{a}\cdot\mathbf{b}. \qquad (23.9)$$

As will be shown in Sec. 23.5, $C(\mathbf{a}, \mathbf{b})$ is also the correlation function for the 
outcomes, expressed in a suitable way, of measurements of the spin components of 
particles a and b in the directions $\mathbf{a}$ and $\mathbf{b}$. 
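
The singlet probabilities and the correlation function can be checked numerically for arbitrary directions. The sketch below uses only the Python standard library. The explicit spinors in `spin_states` are an assumption: they follow the usual convention $|w^+\rangle = \cos(\vartheta/2)e^{-i\varphi/2}|z^+\rangle + \sin(\vartheta/2)e^{i\varphi/2}|z^-\rangle$, which is assumed to agree with (4.14) up to overall phases, and the probabilities do not depend on those phases. The two test directions are arbitrary.

```python
import cmath
from math import acos, atan2, cos, sin, sqrt, isclose

def spin_states(n):
    """Return (|n+>, |n->) in the z basis for a unit vector n = (nx, ny, nz).

    Assumed convention: polar angle theta, azimuth phi, symmetric phases.
    """
    theta, phi = acos(n[2]), atan2(n[1], n[0])
    em, ep = cmath.exp(-0.5j * phi), cmath.exp(0.5j * phi)
    up = (cos(theta / 2) * em, sin(theta / 2) * ep)
    down = (-sin(theta / 2) * em, cos(theta / 2) * ep)
    return up, down

# Singlet state (23.2): amplitude PSI0[i][j] multiplies |z_a^i>|z_b^j>,
# with index 0 for + and index 1 for -.
PSI0 = ((0, 1 / sqrt(2)), (-1 / sqrt(2), 0))

def prob(ket_a, ket_b):
    """|<ket_a, ket_b | psi_0>|^2 for a product state of the two spins."""
    amp = sum(ket_a[i].conjugate() * ket_b[j].conjugate() * PSI0[i][j]
              for i in range(2) for j in range(2))
    return abs(amp) ** 2

def correlation(a, b):
    """C(a, b) built from the four joint probabilities, as in (23.7)."""
    (a_up, a_dn), (b_up, b_dn) = spin_states(a), spin_states(b)
    return (prob(a_up, b_up) + prob(a_dn, b_dn)
            - prob(a_up, b_dn) - prob(a_dn, b_up))

# Two arbitrary unit vectors given by their polar angles.
a = (sin(0.7) * cos(2.0), sin(0.7) * sin(2.0), cos(0.7))
b = (sin(1.1) * cos(0.4), sin(1.1) * sin(0.4), cos(1.1))
a_dot_b = sum(x * y for x, y in zip(a, b))

print(correlation(a, b), -a_dot_b)
```

The two printed numbers should agree to rounding error, in line with (23.9), and each joint probability can be compared with (23.8) in the same way.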


23.3 Histories for three times 

Let us now consider various families of histories for the times $t_0 < t_1 < t_2$, assuming
an initial state $\psi_0$ at $t_0$. One possibility is a unitary history with $\psi_0$ at all three
times, but in addition there are various stochastic histories. As a first example,
consider the consistent family whose support consists of the two histories

$$\psi_0 \odot \begin{cases} z_a^+ z_b^- \odot z_a^+ z_b^-, \\ z_a^- z_b^+ \odot z_a^- z_b^+. \end{cases} \tag{23.10}$$

Each history carries a weight of 1/2 and describes a situation in which $S_{bz} = -S_{az}$,
with values which are independent of time for $t > t_0$. In particular, one has
conditional probabilities

$$\Pr(z_{a1}^+ \mid z_{a2}^+) = \Pr(z_{b1}^- \mid z_{a2}^+) = \Pr(z_{b2}^- \mid z_{a2}^+) = 1, \tag{23.11}$$

$$\Pr(z_{a1}^- \mid z_{a2}^-) = \Pr(z_{b1}^+ \mid z_{a2}^-) = \Pr(z_{b2}^+ \mid z_{a2}^-) = 1, \tag{23.12}$$

among others, where the time, $t_1$ or $t_2$, at which an event occurs is indicated by a
subscript 1 or 2. Thus if $S_{az} = +1/2$ at $t_2$, then it had this same value at $t_1$, and one
can be certain that $S_{bz}$ has the value $-1/2$ at both $t_1$ and $t_2$.
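
The weights behind (23.10) and (23.11) can be reproduced with a few lines of code. This is
a sketch under stated assumptions, not a general histories calculator: the time evolution
of the spins is taken to be trivial, the singlet state is assumed to be
$(|z_a^+\rangle|z_b^-\rangle - |z_a^-\rangle|z_b^+\rangle)/\sqrt{2}$, and the names `project` and `weight` are
illustrative helpers. The weight of a history is computed as the squared norm of its chain ket.

```python
import math
from itertools import product

SQ = 1 / math.sqrt(2)
Z = {"+": (1.0, 0.0), "-": (0.0, 1.0)}   # kets for S_z = +1/2 and S_z = -1/2
PSI0 = [[0.0, SQ], [-SQ, 0.0]]           # PSI0[i][j]: amplitude for a in z-state i, b in z-state j

def project(psi, u, v):
    """Apply the projector onto the product ket |u>_a |v>_b (real kets only)."""
    amp = sum(u[i] * v[j] * psi[i][j] for i in range(2) for j in range(2))
    return [[amp * u[i] * v[j] for j in range(2)] for i in range(2)]

def weight(events):
    """Squared norm of the chain ket; trivial time evolution is assumed."""
    psi = PSI0
    for u, v in events:
        psi = project(psi, u, v)
    return sum(psi[i][j] ** 2 for i in range(2) for j in range(2))

# Weights of all 16 histories with S_z events (a1, b1) at t_1 and (a2, b2) at t_2.
weights = {
    (a1, b1, a2, b2): weight([(Z[a1], Z[b1]), (Z[a2], Z[b2])])
    for a1, b1, a2, b2 in product("+-", repeat=4)
}
support = {h: w for h, w in weights.items() if w > 1e-12}

# Pr(z_b^- at t_1 | z_a^+ at t_2), the middle term of (23.11).
numerator = sum(w for (a1, b1, a2, b2), w in weights.items() if a2 == "+" and b1 == "-")
denominator = sum(w for (a1, b1, a2, b2), w in weights.items() if a2 == "+")
print(support)
print(numerator / denominator)
```

Only the two histories of (23.10) survive in `support`, each with weight 1/2, so the
conditional probability is 1.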

Because of spherical symmetry, the same sort of family can be constructed with
z replaced by an arbitrary direction w. In particular, with w = x, we have a family
with support

$$\psi_0 \odot \begin{cases} x_a^+ x_b^- \odot x_a^+ x_b^-, \\ x_a^- x_b^+ \odot x_a^- x_b^+. \end{cases} \tag{23.13}$$

Again, each history has a weight of 1/2, and now it is the values of $S_{ax}$ and $S_{bx}$
which are of opposite sign and independent of time, and the results in (23.11) and
(23.12) hold with z replaced by x. The two families (23.10) and (23.13) are obviously
incompatible with each other because the projectors for one family do not
commute with those of the other. There is no way in which they can be combined
in a single description, and the corresponding conditional probabilities cannot be
related to one another, since they are defined on separate sample spaces.
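
The incompatibility can be made concrete for the single-particle factors of these
projectors. The sketch below writes the projectors onto $S_z = +1/2$ and $S_x = +1/2$ as
$2 \times 2$ matrices in the $S_z$ basis and evaluates their commutator; the names `Z_PLUS`,
`X_PLUS` and `matmul` are illustrative, not notation from the text.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

Z_PLUS = [[1.0, 0.0], [0.0, 0.0]]   # projector onto S_z = +1/2
X_PLUS = [[0.5, 0.5], [0.5, 0.5]]   # projector onto S_x = +1/2, in the S_z basis

ZX = matmul(Z_PLUS, X_PLUS)
XZ = matmul(X_PLUS, Z_PLUS)
commutator = [[ZX[i][j] - XZ[i][j] for j in range(2)] for i in range(2)]
print(commutator)  # a nonzero matrix: the two projectors do not commute
```

Since the commutator is not zero, no single projective decomposition of the identity
contains both projectors, which is why (23.10) and (23.13) have no common refinement.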

One can also consider a family in which a stochastic branching takes place between
$t_1$ and $t_2$ instead of between $t_0$ and $t_1$; thus (23.10) can be replaced with

$$\psi_0 \odot \psi_0 \odot \{z_a^+ z_b^-,\; z_a^- z_b^+\}. \tag{23.14}$$

In this case the last equality in (23.11) remains valid, but the other conditional
probabilities in (23.11) and (23.12) are undefined, because (23.14) does not contain
projectors corresponding to values of $S_{az}$ and $S_{bz}$ at time $t_1$, and they cannot be
added to this family, as they do not commute with $\psi_0$.

One need not limit oneself to families in which the same component of spin
angular momentum is employed for both particles. The four histories

$$\psi_0 \odot \begin{cases}
z_a^+ x_b^+ \odot z_a^+ x_b^+, \\
z_a^+ x_b^- \odot z_a^+ x_b^-, \\
z_a^- x_b^+ \odot z_a^- x_b^+, \\
z_a^- x_b^- \odot z_a^- x_b^-
\end{cases} \tag{23.15}$$

form the support of a consistent family. Since they all have equal weight, one has
conditional probabilities

$$\Pr(x_b^+ \mid z_a^+) = 1/2 = \Pr(x_b^- \mid z_a^+), \tag{23.16}$$

and others of a similar type which hold for events at both $t_1$ and $t_2$, which is why
subscripts 1 and 2 have been omitted. In addition, the values of $S_{az}$ and $S_{bx}$ do not
change with time:

$$\Pr(z_{a2}^+ \mid z_{a1}^+) = 1 = \Pr(z_{a2}^- \mid z_{a1}^-),$$

$$\Pr(x_{b2}^+ \mid x_{b1}^+) = 1 = \Pr(x_{b2}^- \mid x_{b1}^-). \tag{23.17}$$


Yet another consistent family, with support

$$\psi_0 \odot \begin{cases}
z_a^+ z_b^- \odot z_a^+ \{x_b^+, x_b^-\}, \\
z_a^- z_b^+ \odot z_a^- \{x_b^+, x_b^-\},
\end{cases} \tag{23.18}$$

where $z_a^+ \{x_b^+, x_b^-\}$ denotes the pair of projectors $z_a^+ x_b^+$ and $z_a^+ x_b^-$, combines features
of (23.10) and (23.15): values of $S_{az}$ are part of the description at both $t_1$ and $t_2$,
but in the case of particle b, two separate components $S_{bz}$ and $S_{bx}$ are employed
at $t_1$ and $t_2$. It is important to notice that this change is not brought about by any
dynamical effect; instead, it is simply a consequence of using (23.18) rather than
(23.10) or (23.15) as the framework for constructing the stochastic description. In
particular, one can have a history in which $S_{bz} = +1/2$ at $t_1$ and $S_{bx} = -1/2$ at
$t_2$. This does not mean that some torque is present which rotates the direction of
the spin from the $+z$ to the $-x$ direction, for there is nothing which could produce
such a torque. See the discussion following (9.33) in Sec. 9.3.

The families of histories considered thus far all satisfy the consistency conditions,
as is clear from the fact that the final projectors are mutually orthogonal.
Given that three times are involved, inconsistent families are also possible. Here
is one which will be discussed later from the point of view of measurements. It
contains the sixteen histories which can be represented in the compact form

$$\psi_0 \odot \{x_a^+, x_a^-\}\{z_b^+, z_b^-\} \odot \{z_a^+, z_a^-\}\{x_b^+, x_b^-\}, \tag{23.19}$$

where the product of curly brackets at each of the two times stands for a collection
of four projectors, as in (23.5). Each history makes use of one of the four projectors
at each of the two times; for example,

$$\psi_0 \odot x_a^+ z_b^+ \odot z_a^- x_b^+ \tag{23.20}$$

is one of the sixteen histories. Each of these histories has a finite weight, and
the chain kets of the four histories ending in $z_a^- x_b^+$, to take an example, are all
proportional to $|z_a^-\rangle |x_b^+\rangle$, so cannot be orthogonal to each other.
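
The failure of consistency in (23.19) can be verified directly by computing chain kets.
The sketch below assumes trivial time evolution and the singlet state
$(|z_a^+\rangle|z_b^-\rangle - |z_a^-\rangle|z_b^+\rangle)/\sqrt{2}$; the helpers `project`, `chain_ket` and `inner` are
illustrative names. It builds the four chain kets of the histories ending in $z_a^- x_b^+$
and evaluates their mutual inner products.

```python
import math

SQ = 1 / math.sqrt(2)
ZP, ZM = (1.0, 0.0), (0.0, 1.0)    # kets for S_z = +1/2 and -1/2
XP, XM = (SQ, SQ), (SQ, -SQ)       # kets for S_x = +1/2 and -1/2
PSI0 = [[0.0, SQ], [-SQ, 0.0]]     # PSI0[i][j]: amplitude for a in z-state i, b in z-state j

def project(psi, u, v):
    """Apply the projector onto the product ket |u>_a |v>_b (real kets only)."""
    amp = sum(u[i] * v[j] * psi[i][j] for i in range(2) for j in range(2))
    return [[amp * u[i] * v[j] for j in range(2)] for i in range(2)]

def chain_ket(events):
    """Chain ket of a history starting from psi_0, assuming trivial time evolution."""
    psi = PSI0
    for u, v in events:
        psi = project(psi, u, v)
    return psi

def inner(p, q):
    """Inner product of two real two-spin kets."""
    return sum(p[i][j] * q[i][j] for i in range(2) for j in range(2))

final = (ZM, XP)  # the event z_a^- x_b^+ at t_2
kets = [chain_ket([(xa, zb), final]) for xa in (XP, XM) for zb in (ZP, ZM)]
weights = [inner(k, k) for k in kets]
overlaps = [inner(kets[m], kets[n]) for m in range(4) for n in range(m + 1, 4)]
print(weights)
print(overlaps)
```

Each of the four histories has weight 1/16, and every pair of distinct chain kets has an
inner product of magnitude 1/16 rather than zero, so the consistency conditions are violated.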


23.4 Measurements of one spin 

Suppose that the z-component $S_{az}$ of the spin of particle a is measured using a
Stern-Gerlach apparatus as discussed in Ch. 17. The initial state of the apparatus
is $|Z_a^\circ\rangle$, and its interaction with the particle during the time interval from $t_1$ to $t_2$
gives rise to a unitary time evolution

$$|z_a^+\rangle |Z_a^\circ\rangle \mapsto |Z_a^+\rangle, \qquad |z_a^-\rangle |Z_a^\circ\rangle \mapsto |Z_a^-\rangle, \tag{23.21}$$

where $|Z_a^+\rangle$ and $|Z_a^-\rangle$ are apparatus states (“pointer positions”) indicating the two
possible outcomes of the measurement. Note that the spin states no longer appear
on the right side; we are assuming that at $t_2$ the spin-half particle has become part
of the measuring apparatus. (Thus (23.21) represents a destructive measurement
in the terminology of Sec. 17.1. One could also consider nondestructive measurements
in which the value of $S_{az}$ is the same after the measurement as it is before,
by using (18.17) in place of (23.21), but these will not be needed for the following
discussion.) The b particle has no effect on the apparatus, and vice versa. That is,
one can place an arbitrary spin state $|w_b\rangle$ for the b particle on both sides of the
arrows in (23.21).

Consider the consistent family with support

$$\Psi_0^z \odot \begin{cases}
z_a^+ z_b^- \odot Z_a^+ z_b^-, \\
z_a^- z_b^+ \odot Z_a^- z_b^+,
\end{cases} \tag{23.22}$$

where the initial state is $|\Psi_0^z\rangle = |\psi_0\rangle |Z_a^\circ\rangle$. The conditional probabilities

$$\Pr(z_{a1}^+ \mid Z_{a2}^+) = 1 = \Pr(z_{a1}^- \mid Z_{a2}^-), \tag{23.23}$$

$$\Pr(z_{b1}^- \mid Z_{a2}^+) = 1 = \Pr(z_{b1}^+ \mid Z_{a2}^-), \tag{23.24}$$

$$\Pr(z_{b2}^- \mid Z_{a2}^+) = 1 = \Pr(z_{b2}^+ \mid Z_{a2}^-), \tag{23.25}$$

$$\Pr(z_{b2}^- \mid z_{b1}^-) = 1 = \Pr(z_{b2}^+ \mid z_{b1}^+) \tag{23.26}$$

are an obvious consequence of (23.22). The first pair, (23.23), tell us that the
measurement is, indeed, a measurement: the outcomes $Z_a^\pm$ actually reveal values
of $S_{az}$ before the measurement took place. Those in (23.24) and (23.25) tell us
that the measurement is also an indirect measurement of $S_{bz}$ for particle b, even
though this particle never interacts with the apparatus that measures $S_{az}$, since the
measurement outcomes $Z_a^+$ and $Z_a^-$ are correlated with the properties $z_b^-$ and $z_b^+$.

There is nothing very surprising about carrying out an indirect measurement of 
the property of a distant object in this way, and the ability to do so does not indicate 
any sort of mysterious long-range or nonlocal influence. Consider the following 
analogy. Two slips of paper, one red and one green, are placed in separate opaque 
envelopes. One envelope is mailed to a scientist in Atlanta and the other to a 
scientist in Boston. When the scientist in Atlanta opens the envelope and looks at 
the slip of paper, he can immediately infer the color of the slip in the envelope in 
Boston, and for this reason he has, in effect, carried out an indirect measurement. 
Furthermore, this measurement indicates the color of the slip of paper in Boston 
not only at the time the measurement is carried out, but also at earlier and later 
times, assuming the slip in Boston does not undergo some process which changes 
its color. In the same way, the outcome, $Z_a^+$ or $Z_a^-$, for the measurement of $S_{az}$
allows one to infer the value of $S_{bz}$ both at $t_1$ and at $t_2$, and at later times as well
if one extends the histories in (23.22) in an appropriate manner. In order for this
inference to be correct, it is necessary that particle b not interact with anything,
such as a measuring device or magnetic field, which could perturb its spin.

The conditional probabilities in (23.26) tell us that $S_{bz}$ is the same at $t_2$ as at
$t_1$, consistent with our assumption that particle b has not interacted with anything
during this time interval. Note, in particular, that carrying out a measurement on
$S_{az}$ has no influence on $S_{bz}$, which is just what one would expect, since particle b is
isolated from particle a, and from the measuring apparatus, at all times later than
$t_0$.

A similar discussion applies to a measurement carried out on some other component
of the spin of particle a. To measure $S_{ax}$, what one needs is an apparatus
initially in the state $|X_a^\circ\rangle$, which during the time interval from $t_1$ to $t_2$ interacts with
particle a in such a way as to give rise to the unitary time transformation

$$|x_a^+\rangle |X_a^\circ\rangle \mapsto |X_a^+\rangle, \qquad |x_a^-\rangle |X_a^\circ\rangle \mapsto |X_a^-\rangle. \tag{23.27}$$

The counterpart of (23.22) is the consistent family with support

$$\Psi_0^x \odot \begin{cases}
x_a^+ x_b^- \odot X_a^+ x_b^-, \\
x_a^- x_b^+ \odot X_a^- x_b^+,
\end{cases} \tag{23.28}$$

where the initial state is now $|\Psi_0^x\rangle = |\psi_0\rangle |X_a^\circ\rangle$. Using this family, one can calculate
probabilities analogous to those in (23.23)-(23.26), with z and Z replaced by x and
X. Thus in this framework a measurement of $S_{ax}$ is an indirect measurement of $S_{bx}$,
and one can show that the measurement has no effect upon $S_{bx}$.

Comparing (23.22) with (23.10), or (23.28) with (23.13), shows that the families
which describe measurement results are close parallels of those describing the
system of two spins in the absence of any measurements. To include the measurement,
one simply introduces an appropriate initial state at $t_0$, and replaces one of
the lower case letters at $t_2$ with the corresponding capital to indicate a measurement
outcome. This should come as no surprise: apparatus designed to measure some
property will, if it is working properly, measure that property. Once one knows
how to describe a quantum system in terms of its microscopic properties, the addition
of a measurement apparatus of an appropriate type will simply confirm the
correctness of the microscopic description.

Replacing lower case with capital letters can also be used to construct measurement
counterparts of other consistent families in Sec. 23.3. The counterpart of
(23.14) when $S_{az}$ is measured is the family with support

$$\Psi_0^z \odot \Psi_0^z \odot \{Z_a^+ z_b^-,\; Z_a^- z_b^+\}. \tag{23.29}$$

Using this family one can deduce the conditional probabilities in (23.25) referring
to the values of $S_{bz}$ at $t_2$, and thus the measurement of $S_{az}$, viewed within this
framework, is again an indirect measurement of $S_{bz}$ at $t_2$. However, the results in
(23.23), (23.24), and (23.26) are not valid for the family (23.29), because values of
$S_{az}$ and $S_{bz}$ cannot be defined at $t_1$: the corresponding projectors do not commute
with the $\psi_0$ part of $\Psi_0^z$.

One reason for introducing (23.29) is that it is the family which comes closest
to representing the idea that a measurement is associated with a collapse of the
wave function of the measured system. In the case at hand, the measured system
can be thought of as the spin state of the two particles, but since particle a is no
longer relevant to the discussion at $t_2$, collapse should be thought of as resulting
in a state $|z_b^-\rangle$ or $|z_b^+\rangle$ for particle b, depending upon whether the measurement
outcome is $Z_a^+$ or $Z_a^-$. (In the case of a nondestructive measurement on particle a
the states resulting from the collapse would be $|z_a^+\rangle|z_b^-\rangle$ and $|z_a^-\rangle|z_b^+\rangle$.) As pointed
out in Sec. 18.2, wave function collapse is basically a mathematical procedure for
computing certain types of conditional probabilities. Regarding it as some sort
of physical process gives rise to a misleading picture of instantaneous influences
which can travel faster than the speed of light. The remarks in Sec. 18.2 with
reference to the beam splitter in Fig. 18.1 apply equally well to spatially separated
systems of spin-half particles, or of photons, etc.

One way to see that the measurement of $S_{az}$ is not a process which somehow
brings $S_{bz}$ into existence at $t_2$ is to note that the change between $t_1$ and the final
time $t_2$ in (23.29) is similar to the change which occurs in the family (23.14), where
there is no measurement. Another way to see this is to consider the family whose
support consists of the four histories

$$\Psi_0^z \odot \Psi_0^z \odot \{Z_a^+, Z_a^-\}\{x_b^+, x_b^-\} \tag{23.30}$$

in the compact notation used earlier in (23.5). This resembles (23.29), except that
the components of $S_{bx}$ rather than $S_{bz}$ appear at $t_2$. Were the measurement having
some physical effect on particle b, it would be just as sensible to suppose that it
produces random values of $S_{bx}$, as that it results in a value of $S_{bz}$ correlated with
the outcome of the measurement!

It was noted earlier that (23.26) implies that measuring $S_{az}$ has no effect upon
$S_{bz}$. Nor does such a measurement influence any other component of the spin of
particle b, as can be seen by constructing an appropriate consistent family in which
this component enters the description at both $t_1$ and $t_2$. Thus in the case of $S_{bx}$ one
can use the measurement counterpart of (23.15), a family with support

$$\Psi_0^z \odot \begin{cases}
z_a^+ x_b^+ \odot Z_a^+ x_b^+, \\
z_a^+ x_b^- \odot Z_a^+ x_b^-, \\
z_a^- x_b^+ \odot Z_a^- x_b^+, \\
z_a^- x_b^- \odot Z_a^- x_b^-.
\end{cases} \tag{23.31}$$

It is then evident by inspection that $S_{bx}$ is the same at $t_1$ and $t_2$. Using this family
one obtains the conditional probabilities

$$\Pr(x_b^+ \mid Z_{a2}^+) = 1/2 = \Pr(x_b^- \mid Z_{a2}^+),$$

$$\Pr(x_b^+ \mid Z_{a2}^-) = 1/2 = \Pr(x_b^- \mid Z_{a2}^-), \tag{23.32}$$

where the subscript indicating the time has been omitted from $x_b^\pm$, since these results
apply equally at $t_1$ and $t_2$. Of course (23.32) is nothing but the measurement
counterpart of (23.16). It tells one that a measurement of $S_{az}$ can in no way be
regarded as an indirect measurement of $S_{bx}$. Similar results are obtained if the projectors
corresponding to $S_{bx}$ in (23.31) are replaced by those corresponding to $S_{bw}$
for some other direction w, except that the conditional probabilities for $w_b^+$ and $w_b^-$
in the expression corresponding to (23.32) will depend upon w. If w is close to z,
a measurement of $S_{az}$ is an approximate indirect measurement of $S_{bw}$ in the sense
that $S_{bw} = -S_{az}$ for most experimental runs, with occasional errors.
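
How often such errors occur can be estimated from (23.6), on the assumption that the
weights of the measurement histories are the same as those of the corresponding particle
properties. Dividing the joint probability $(1 + \cos\vartheta)/4$ by the probability 1/2 of the
outcome $Z_a^+$ gives the chance that $S_{bw} = -1/2$. The function names below are illustrative only.

```python
import math

def prob_opposite(theta):
    """Pr(w_b^- | Z_a^+) when w makes an angle theta with z: (1 + cos theta)/4 divided by 1/2."""
    return ((1 + math.cos(theta)) / 4) / 0.5

def error_rate(theta):
    """Fraction of runs in which S_bw fails to equal -S_az."""
    return 1 - prob_opposite(theta)

for degrees in (0, 5, 10, 30, 90):
    print(degrees, error_rate(math.radians(degrees)))
```

The error rate equals $\sin^2(\vartheta/2)$: it vanishes for w = z, grows quadratically for small
angles, and reaches 1/2 for w = x, in agreement with (23.32).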

The family

$$\Psi_0^z \odot \begin{cases}
z_a^+ z_b^- \odot Z_a^+ \{x_b^+, x_b^-\}, \\
z_a^- z_b^+ \odot Z_a^- \{x_b^+, x_b^-\}
\end{cases} \tag{23.33}$$

is the counterpart of (23.18) when $S_{az}$ is measured. Here the events involving the
spin of particle b are different at $t_2$ from what they are at $t_1$. However, just as in
the case of (23.18), for which no measurement occurs, one should not think of
this change as a physical consequence of the measurement. See the discussion
following (23.18).


23.5 Measurements of two spins 

Thus far we have only considered measurements on particle a. One can also imag- 
ine carrying out measurements on the spins of both particles. All that is needed 
is a second measuring device of a type appropriate for whatever component of the 
spin of particle b is of interest. If, for example, this is $S_{bx}$, then the unitary time
transformation from $t_1$ to $t_2$ will be the same as (23.27) except for replacing the
subscript a with b. In what follows it will be convenient to assume that measurements
are carried out on both particles at the same time. However, this is not essential; 
analogous results are obtained if measurements are carried out at different times. 
The properties of a particle will, in general, be different before and after it is mea- 
sured, but the time at which a measurement is carried out on the other particle is 
completely irrelevant. 

For the combined system of two particles and two measuring devices a typical
unitary transformation from $t_1$ to $t_2$ takes the form:

$$|z_a^+\rangle |x_b^-\rangle |Z_a^\circ\rangle |X_b^\circ\rangle \mapsto |Z_a^+\rangle |X_b^-\rangle. \tag{23.34}$$

Once again, one can generate consistent families for measurements by starting off
with any of the consistent families in Sec. 23.3, replacing $\psi_0$ with an appropriate
initial state which includes each apparatus in its ready state, and then replacing
lower case letters at the final time with corresponding capitals. For example

$$\Psi_0^{zz} \odot \begin{cases}
z_a^+ z_b^- \odot Z_a^+ Z_b^-, \\
z_a^- z_b^+ \odot Z_a^- Z_b^+,
\end{cases} \tag{23.35}$$

with $|\Psi_0^{zz}\rangle = |\psi_0\rangle |Z_a^\circ\rangle |Z_b^\circ\rangle$, is the counterpart of (23.10), and it shows that the
outcomes of measurements of $S_{az}$ and $S_{bz}$ will be perfectly anticorrelated:

$$\Pr(Z_b^- \mid Z_a^+) = 1 = \Pr(Z_b^+ \mid Z_a^-). \tag{23.36}$$

Not only does one obtain consistent families by this process of “capitalizing”
those in Sec. 23.3, the weights for histories involving measurements are also precisely
the same as their counterparts that involve only particle properties. This
means that the correlation function $C(\mathbf{a}, \mathbf{b})$ introduced in Sec. 23.1 can be applied
to measurement outcomes as well as to microscopic properties. To do this, let $\alpha(w)$
be $+1$ if the apparatus designed to measure $S_{aw}$ is in the state $|W_a^+\rangle$ at $t_2$, and $-1$
if it is in the state $|W_a^-\rangle$, and define $\beta(w)$ in the same way for measurements on
particle b. Then we can write

$$C(\mathbf{a}, \mathbf{b}) = \langle \alpha(w_a)\,\beta(w_b) \rangle = -\mathbf{a}\cdot\mathbf{b} \tag{23.37}$$

as the average over a large number of experimental runs of the product $\alpha(w_a)\beta(w_b)$
when $w_a$ is $\mathbf{a}$ and $w_b$ is $\mathbf{b}$.
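
The run-averaged product in (23.37) can be illustrated with a small simulation. The sketch
below simply draws pairs of outcomes $\pm 1$ from the joint distribution (23.8) for one fixed
pair of directions; it is a sampling exercise for the quantum probabilities already derived,
not a model of how the outcomes arise. The names `sample_outcomes` and
`estimate_correlation`, the number of runs and the seed are all arbitrary choices.

```python
import random

def sample_outcomes(cos_ab, rng):
    """Draw one pair (alpha, beta) of +/-1 outcomes from the joint distribution (23.8)."""
    p_same = (1 - cos_ab) / 2        # Pr(+,+) + Pr(-,-)
    alpha = rng.choice((1, -1))      # each outcome for particle a has probability 1/2
    beta = alpha if rng.random() < p_same else -alpha
    return alpha, beta

def estimate_correlation(cos_ab, runs=100_000, seed=0):
    """Average of alpha * beta over many simulated experimental runs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        alpha, beta = sample_outcomes(cos_ab, rng)
        total += alpha * beta
    return total / runs

cos_ab = 0.5  # a . b for directions 60 degrees apart
print(estimate_correlation(cos_ab), -cos_ab)
```

With a large number of runs the average approaches $-\mathbf{a}\cdot\mathbf{b}$, up to statistical
fluctuations of order the inverse square root of the number of runs.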

The physical significance of C in (23.37) is, of course, different from that in 
(23.9). The former refers to measurement outcomes and the latter to properties 
of the two particles. However, they are identical functions of a and b, and given 
that the measurements accurately reflect previous values of the corresponding spin 
components, no confusion will arise from using the same symbol in both cases. 
One could also, to be sure, define the same sort of correlation for a case in which a 
spin component is measured for only one particle, using the product of the outcome 
of that measurement, understood as $\pm 1$, with twice the value (in units of $\hbar$) of the
appropriate spin component for the other particle; for example

$$C(w, w') = \langle \alpha(w)\, 2S_{bw'} \rangle. \tag{23.38}$$


As noted in Sec. 23.4, the outcome of a measurement of the z component of the
spin of particle a can be used to infer the value of $S_{az}$ before the measurement,
and the value of $S_{bz}$ for particle b as long as that particle remains isolated. The
roles of particles a and b can be interchanged: a measurement of $S_{bz}$ for particle
b allows one to infer the value of $S_{az}$. And because of the spherical symmetry of
$\psi_0$, the same results hold if z is replaced by any other direction w. How are these
results modified, or extended, if the spins of both particles are measured? If the
same component of spin is measured for particle b as for particle a, the results are
just what one would expect. Suppose it is the z component; then (23.35) shows that
one can infer both $z_a^+$ and $z_b^-$ on the basis of the outcome $Z_a^+$, or of the outcome
$Z_b^-$, a result which is not surprising since one outcome implies the other, (23.36).

Things become more complicated if the a and b measurements involve different
components, and in this case it is necessary to pay careful attention to the framework
one is using for inferring microscopic properties from the outcomes of the
measurements. To illustrate this, let us suppose that $S_{az}$ is measured for particle a
and $S_{bx}$ for particle b. One consistent family that can be used for analyzing this
situation is the counterpart of (23.15):

$$\Psi_0^{zx} \odot \begin{cases}
z_a^+ x_b^+ \odot Z_a^+ X_b^+, \\
z_a^+ x_b^- \odot Z_a^+ X_b^-, \\
z_a^- x_b^+ \odot Z_a^- X_b^+, \\
z_a^- x_b^- \odot Z_a^- X_b^-.
\end{cases} \tag{23.39}$$

Here the initial state is $|\Psi_0^{zx}\rangle = |\psi_0\rangle |Z_a^\circ\rangle |X_b^\circ\rangle$. Using this family allows one to
infer from the outcome of each measurement something about the spin of the same
particle at an earlier time, but nothing about the spin of the other particle. Thus one
has

$$\Pr(z_{a1}^+ \mid Z_{a2}^+) = 1 = \Pr(z_{a1}^- \mid Z_{a2}^-), \tag{23.40}$$

$$\Pr(x_{b1}^+ \mid X_{b2}^+) = 1 = \Pr(x_{b1}^- \mid X_{b2}^-), \tag{23.41}$$

but there is no counterpart of (23.24) relating $S_{bz}$ to $Z_a^\pm$, nor a way to relate $S_{ax}$
to $X_b^\pm$, because the relevant projectors, such as $z_b^\pm$, are not present in (23.39) at $t_1$,
nor can they be added, since they do not commute with the projectors which are
already there.

On the other hand, the family with support

$$\Psi_0^{zx} \odot \begin{cases}
z_a^+ z_b^- \odot Z_a^+ \{X_b^+, X_b^-\}, \\
z_a^- z_b^+ \odot Z_a^- \{X_b^+, X_b^-\},
\end{cases} \tag{23.42}$$

which is the counterpart of (23.18) and (23.33), can be used to infer values of $S_{bz}$
from the outcomes $Z_a^\pm$. By using it, one obtains the conditional probabilities

$$\Pr(z_{b1}^- \mid Z_{a2}^+) = 1 = \Pr(z_{b1}^+ \mid Z_{a2}^-) \tag{23.43}$$

in addition to (23.40). However, if one uses (23.42) the outcome of the b measurement
tells one nothing about $S_{bx}$ at $t_1$. It is worth noting that a refinement of
(23.42) in which additional events are added at a time $t_{1.5}$, so that the histories

$$\Psi_0^{zx} \odot \begin{cases}
z_a^+ z_b^- \odot z_a^+ x_b^+ \odot Z_a^+ X_b^+, \\
z_a^+ z_b^- \odot z_a^+ x_b^- \odot Z_a^+ X_b^-, \\
z_a^- z_b^+ \odot z_a^- x_b^+ \odot Z_a^- X_b^+, \\
z_a^- z_b^+ \odot z_a^- x_b^- \odot Z_a^- X_b^-
\end{cases} \tag{23.44}$$

are defined at $t_0 < t_1 < t_{1.5} < t_2$, is the support of a consistent family in which one
can infer from $X_b^+$ or $X_b^-$ at $t_2$ the value of $S_{bx}$ at $t_{1.5}$, but not at an earlier time. As
this is a refinement of (23.42), both (23.40) and (23.43) remain valid.

The consistent family with support

$$\Psi_0^{zx} \odot \begin{cases}
x_a^+ x_b^- \odot \{Z_a^+, Z_a^-\} X_b^-, \\
x_a^- x_b^+ \odot \{Z_a^+, Z_a^-\} X_b^+
\end{cases} \tag{23.45}$$

is the counterpart of (23.42) with x rather than z components at $t_1$. One can use it
to infer the x-component of the spin of either particle at $t_1$ from the outcome of the
$S_{bx}$ measurement:

$$\Pr(x_{b1}^+ \mid X_{b2}^+) = 1 = \Pr(x_{b1}^- \mid X_{b2}^-), \tag{23.46}$$

$$\Pr(x_{a1}^- \mid X_{b2}^+) = 1 = \Pr(x_{a1}^+ \mid X_{b2}^-). \tag{23.47}$$


Given the conditional probabilities in (23.43) and (23.47), and no indication of
the consistent families from which they were obtained, one might be tempted to
combine them and draw the conclusion that for a run in which the measurement
outcomes are, say, $Z_a^+$ and $X_b^+$ at $t_2$, both $S_{ax}$ and $S_{bz}$ had the value $-1/2$ at $t_1$:

$$\Pr(x_{a1}^- \wedge z_{b1}^- \mid Z_{a2}^+ \wedge X_{b2}^+) = 1. \tag{23.48}$$

This, however, is not correct. To begin with, the frameworks (23.42) and (23.45)
are mutually incompatible because of the projectors at $t_1$, so they cannot be used
to derive (23.48) by combining (23.43) with (23.47). Next, if one tries to construct
a single consistent family in which it might be possible to derive (23.48), one runs
into the following difficulty. A description which ascribes values to both $S_{ax}$ and
$S_{bz}$ at $t_1$ requires a decomposition of the identity which includes the four projectors
$x_a^+ z_b^+$, $x_a^+ z_b^-$, $x_a^- z_b^+$, and $x_a^- z_b^-$. This by itself is not a problem, but when combined
with the four measurement outcomes, the result is the inconsistent family

$$\Psi_0^{zx} \odot \{x_a^+, x_a^-\}\{z_b^+, z_b^-\} \odot \{Z_a^+, Z_a^-\}\{X_b^+, X_b^-\} \tag{23.49}$$

obtained by replacing $\psi_0$ with $\Psi_0^{zx}$ and capitalizing x and z at $t_2$ in (23.19). The
same arguments used to show that (23.19) is inconsistent apply equally to (23.49);
adding measurements does not improve things. Consequently, because it cannot be
obtained using a consistent family, (23.48) is not a valid result.


24.1 Bohm version of the EPR paradox

Einstein, Podolsky, and Rosen (EPR) were concerned with the following issue. 
Given two spatially separated quantum systems A and B and an appropriate initial 
entangled state, a measurement of a property on system A can be an indirect mea- 
surement of B in the sense that from the outcome of the A measurement one can 
infer with probability 1 a property of B, because the two systems are correlated. 
There are cases in which either of two properties of B represented by noncommut- 
ing projectors can be measured indirectly in this manner, and EPR argued that this 
implied that system B could possess two incompatible properties at the same time, 
contrary to the principles of quantum theory. 

In order to understand this argument, it is best to apply it to a specific model 
system, and we shall do so using Bohm’s formulation of the EPR paradox in which 
the systems A and B are two spin-half particles a and b in two different regions of 
space, with their spin degrees of freedom initially in a spin singlet state (23.2). As 
an aid to later discussion, we write the argument in the form of a set of numbered 
assertions leading to a paradox: a result which seems plausible, but contradicts the 
basic principles of quantum theory. The assertions E1-E4 are not intended to be 
exact counterparts of statements in the original EPR paper, even when the latter 
are translated into the language of spin-half particles. However, the general idea is 
very similar, and the basic conundrum is the same. 

E1. Suppose $S_{az}$ is measured for particle a. The result allows one to predict $S_{bz}$
for particle b, since $S_{bz} = -S_{az}$.

E2. In the same way, the outcome of a measurement of $S_{ax}$ allows one to predict
$S_{bx}$, since $S_{bx} = -S_{ax}$.

E3. Particle b is isolated from particle a, and therefore it cannot be affected by
measurements carried out on particle a.

E4. Consequently, particle b must simultaneously possess values for both $S_{bz}$
and $S_{bx}$, namely the values revealed by the corresponding measurements on
particle a, either of which could be carried out in any given experimental
run.

E5. But this contradicts the basic principles of quantum theory, since in the
two-dimensional spin space one cannot simultaneously assign values of both $S_z$
and $S_x$ to particle b.

Let us explore the paradox by asking how each of these assertions is related to
a precise quantum mechanical description of the situation. We begin with E1, and
employ the notation in Sec. 23.4, with the particles initially in a spin singlet state
$|\psi_0\rangle$, and an apparatus designed to measure $S_{az}$ initially in the state $|Z_a^\circ\rangle$ at time
$t_0$. The interaction of particle a with the apparatus during the time interval from $t_1$
to $t_2$ gives rise to the unitary time transformation (23.21). We then need a consistent
family which includes the possible outcomes $Z_a^+$ and $Z_a^-$ of the measurement,
corresponding to $S_{az} = +1/2$ and $-1/2$, together with the values of $S_{bz}$.

It is useful to begin with the family in (23.29), since it comes the closest among
all the families in Sec. 23.4 to representing how physicists would have thought
about the problem in 1935, when the EPR paper was published. In this family
the initial state evolves unitarily until after the measurement has occurred, when
there is a split (or “collapse”) into the two possibilities $Z_a^+ z_b^-$ and $Z_a^- z_b^+$. Using
this family one can deduce $S_{bz} = -1/2$ from the measurement outcome $Z_a^+$, and
$S_{bz} = +1/2$ from $Z_a^-$; the results can be expressed formally as conditional probabilities,
(23.25). This means that E1 is in agreement with the principles of quantum
theory.

Even stronger results can be obtained using the family (23.22) in which the
stochastic split takes place at an earlier time. In this family it is possible to view
the measurement of $S_{az}$ as revealing a pre-existing property of particle a at a time
before the measurement took place, a value which was already the opposite of $S_{bz}$.
In addition, the value of $S_{bz}$ was unaffected by the measurement of $S_{az}$, a fact
expressed formally by the conditional probabilities in (23.26). Thus this family both
confirms E1 and lends support to E3. Additional support for E3 comes from the
family (23.31), which shows that a measurement of $S_{az}$ does not have any effect
upon $S_{bx}$, and of course one could set up an analogous family using any other
component of spin of particle b, and reach the same conclusion.

Next we come to E2. It is nothing but E1 with $S_z$ replaced by $S_x$ for both particles,
so the preceding discussion of E1 will apply to E2, with obvious modifications.
The family (23.28) with its apparatus for measuring $S_{ax}$ must be used in
place of (23.22), and from it one can deduce the counterparts of (23.23)-(23.26)
with z and Z replaced by x and X. And of course the $S_{ax}$ measurement will not
alter any component of the spin of particle b, which confirms E3.

Assertion E4 would seem to be an immediate consequence of those preceding it
were it not for the requirement that quantum reasoning employ a single framework
in order to reach a sound conclusion, Sec. 16.1. Assertions E1 and E2 have been
justified on the basis of two distinct consistent families, (23.22) and (23.28). Are
these families compatible, that is, can they be combined in a single framework?
One’s first thought is that they cannot be combined, because the projectors for the
properties associated with $S_{az}$ and $S_{bz}$ at $t_1$ (the intermediate time) in (23.22) obviously
do not commute with those in (23.28), which are associated with $S_{ax}$ and $S_{bx}$,
and the same is true of the projectors at $t_2$. However, the situation is not so simple.
The projectors representing the complete histories in (23.22) are orthogonal
to, and hence commute with, the history projectors in (23.28), because the initial
states $|Z_a^\circ\rangle$ and $|X_a^\circ\rangle$ for the apparatus will be orthogonal. This follows from the
fact that an apparatus designed to measure $S_z$ will differ in a visible (macroscopic)
way from one designed to measure $S_x$; see the discussion following (17.10).

Consequently, (23.22) and (23.28) can be combined in a single consistent family
with two distinct initial states: the spin singlet state of the particles combined with
either of the measuring apparatuses. However, the resulting framework does not
support E4. The reason is that the two initial states are mutually exclusive, so that
only one or the other will occur in a particular experimental run. Consequently,
the conclusion that $S_{bz}$ will have a particular value, at $t_1$ or $t_2$, as determined by the
measurement outcome, is only correct for a run in which the apparatus is set up to
measure $S_{az}$, and the corresponding conclusion for $S_{bx}$ only holds for runs in which
the apparatus is set up to measure $S_{ax}$. But E4 asserts that particle b simultaneously
possesses values of $S_z$ and $S_x$, and this conclusion obviously cannot be reached
using the framework under consideration.

To put the matter in a different way, E1 is correct in a situation in which $S_{az}$ is
measured, and E2 in a situation in which $S_{ax}$ is measured. But there is no way to
measure $S_{az}$ and $S_{ax}$ simultaneously for a single particle, and therefore no situation
in which E1 and E2 can be applied to the same particle. Einstein, Podolsky and
Rosen were aware of this type of objection, as they mention it towards the end of
their paper, and they respond in a fashion which can be translated into the language
of spin-half particles in the following way. If one allows that an $S_{az}$ measurement
can be used to predict $S_{bz}$ and an $S_{ax}$ measurement to predict $S_{bx}$, but then asserts
that $S_{bx}$ does not exist when $S_{az}$ is measured, and $S_{bz}$ does not exist when $S_{ax}$ is
measured, this makes the properties of particle b depend upon which measurement
is carried out on particle a, and no reasonable theory could allow this sort of thing.

There is nothing in the analysis presented in Sec. 23.4 to suggest that the properties
of particle b depend in any way upon the type of measurement carried out on
particle a. However, the type of property considered for particle b, $S_{bz}$ as against
$S_{bx}$, depends upon the choice of framework. There are frameworks, such as (23.22)
and (23.29), in which a measurement of $S_{az}$ is combined with values of $S_{bz}$, and
other frameworks, such as (23.30) and (23.31), in which a measurement of $S_{az}$ is
combined with values for $S_{bx}$. Quantum theory does not specify which framework
is to be used for a situation in which $S_{az}$ is measured. However, only a framework
which includes $S_{bz}$ can be used to correlate the outcome of an $S_{az}$ measurement
with some property of the spin of particle b in a way which constitutes an indirect
measurement of the latter.

Thus implicit in the analysis given in the EPR paper is the assumption that quantum
theory is limited to a single framework in the case of an $S_{az}$ measurement,
one corresponding to a wave function collapse picture, (23.29), for this particular
measurement. Once one recognizes that there are many possible frameworks, the 
argument no longer works. One can hardly fault Einstein and his colleagues for 
making such an assumption, as they were seeking to point out an inadequacy of 
quantum mechanics as it had been developed up to that time, with measurement 
and wave function collapse essential features of its physical interpretation. One 
can see in retrospect that they had, indeed, located a severe shortcoming of the 
principal interpretation of quantum theory then available, though they themselves 
did not know how to remedy it. 


24.2 Counterfactuals and the EPR paradox 

An alternative way of thinking about assertion E4 in the previous section is to
consider a case in which $S_{az}$ is measured (and thus $S_{bz}$ is indirectly measured),
and ask what would have been the case, in this particular experimental run, if $S_{ax}$
had been measured instead, e.g., by rotating the direction of the field gradient in
the Stern-Gerlach apparatus just before the arrival of particle a. This requires a
counterfactual analysis, which can be carried out with the help of a quantum coin
toss in the manner indicated in Sec. 19.4. Let the total quantum system be described
by an initial state

$$|\Phi_0\rangle = |\psi_0\rangle \otimes |Q\rangle, \tag{24.1}$$

where $|\psi_0\rangle$ is the spin singlet state (23.2), and $|Q\rangle$ the initial state of the quantum
coin, servomechanism, and the measuring apparatus. (As it is not important for
the following discussion, the center of mass wave function $|\omega_t\rangle$, (23.1), has been
omitted, just as in Ch. 23.) It will be convenient to assume that the quantum coin
toss corresponds to a unitary time development

$$|Q\rangle \mapsto \bigl(|X_a^\circ\rangle + |Z_a^\circ\rangle\bigr)/\sqrt{2}, \tag{24.2}$$

during the interval from $t_1$ to $t_2$, and that the measurement of $S_{ax}$ or $S_{az}$ takes place
during the time interval from $t_2$ to $t_3$, rather than between $t_1$ and $t_2$ as in Ch. 23.
Here $|X_a^\circ\rangle$ and $|Z_a^\circ\rangle$ are states of the apparatus in which it is ready to measure $S_{ax}$
and $S_{az}$, respectively, and the servomechanism, etc., is thought of as included in
these states. Thus the overall unitary time development from the initial time $t_0$ to
the final time is given by

$$
\begin{aligned}
|\Phi_0\rangle \mapsto |\Phi_0\rangle &\mapsto |\psi_0\rangle \bigl(|X_a^\circ\rangle + |Z_a^\circ\rangle\bigr)/\sqrt{2} \\
&\mapsto \bigl(|x_b^-\rangle|X_a^+\rangle - |x_b^+\rangle|X_a^-\rangle + |z_b^-\rangle|Z_a^+\rangle - |z_b^+\rangle|Z_a^-\rangle\bigr)/2.
\end{aligned}
\tag{24.3}
$$

The final step from $t_2$ to $t_3$ is obtained by assuming that (23.27) applies when the
apparatus is in the state $|X_a^\circ\rangle$, and (23.21) when it is in the state $|Z_a^\circ\rangle$ at $t_2$.

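A consistent family $\mathcal{F}_1$ which provides one way of analyzing the counterfactual
question posed at the beginning of this section has for its support six histories for
times $t_0 < t_1 < t_2 < t_3$. It is convenient to arrange them in two groups of three:

$$
\Phi_0 \odot z_b^- \odot
\begin{cases}
Z_a^\circ \odot Z_a^+ z_b^-, \\
X_a^\circ \odot \{X_a^+ z_b^-,\; X_a^- z_b^-\},
\end{cases}
\qquad
\Phi_0 \odot z_b^+ \odot
\begin{cases}
Z_a^\circ \odot Z_a^- z_b^+, \\
X_a^\circ \odot \{X_a^+ z_b^+,\; X_a^- z_b^+\}.
\end{cases}
\tag{24.4}
$$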


Suppose the coin toss resulted in $S_{az}$ being measured, and the outcome was $Z_a^+$,
implying $S_{bz} = -1/2$. To answer the question of what would have happened if $S_{ax}$
had been measured instead, use the procedure of Sec. 19.4 and trace the outcome
$Z_a^+ z_b^-$ in the first set of histories in (24.4) backwards to the pivot $z_b^-$ and then
forwards through the $X_a^\circ$ node to the corresponding events at $t_3$. One concludes that
had the quantum coin toss resulted in a measurement of $S_{ax}$, the outcome would
have been $X_a^+$ or $X_a^-$, each with probability 1/2, but in either case $S_{bz}$ would have
had the value $-1/2$, corresponding to $z_b^-$, that is to say, the same value it had in
the actual world in which $S_{az}$, not $S_{ax}$, was measured. This conclusion seems very
reasonable on physical grounds, for one would not expect a last minute choice to
measure $S_x$ rather than $S_z$ for particle a to have any influence on the distant particle
b, since the measuring apparatus does not interact in any way with particle b. To
put the matter in another way, the conclusion of this counterfactual analysis agrees
with the discussion of E3 in Sec. 24.1.

On the other hand, (24.4) by itself provides no immediate support for E4, for it
supplies no information at all about $S_{bx}$. Of course, this is only one consistent
family, and one might hope to do better using some other framework. One possibility
might be the consistent family $\mathcal{F}_2$ with support

$$
\Phi_0 \odot \Phi_0 \odot
\begin{cases}
Z_a^\circ \odot \{Z_a^+ z_b^-,\; Z_a^- z_b^+\}, \\
X_a^\circ \odot \{X_a^+ x_b^-,\; X_a^- x_b^+\},
\end{cases}
\tag{24.5}
$$

which corresponds pretty closely to the notion of wave function collapse. Once
again assume that the quantum coin toss leads to an $S_{az}$ measurement, and that the
outcome of this measurement is $Z_a^+$. Using $\Phi_0$ at $t_0$ or $t_1$ as the pivot, one concludes
that had $S_{ax}$ been measured instead, $S_{bx}$ would have been $-1/2$ for the outcome
$X_a^+$, and $+1/2$ for the outcome $X_a^-$.

This result seems encouraging, for we have found a consistent family in which
both $S_{bz}$ and $S_{bx}$ values appear, correlated in the expected way with $S_{az}$ and $S_{ax}$
measurements. However, the $S_{bz}$ states $z_b^\pm$ and the $S_{bx}$ states $x_b^\pm$ in (24.5) are
contextual properties in the sense of Ch. 14: $z_b^+$ and $z_b^-$ both depend on $Z_a^\circ$, and $x_b^+$ and
$x_b^-$ both depend on $X_a^\circ$. This means (see the discussion in Ch. 14) that when
using (24.5), one cannot think of $S_{bz}$ and $S_{bx}$ as having values independent of the
quantum coin toss. Only if the toss results in $Z_a^\circ$ is it meaningful to talk about $S_{bz}$,
and only if it results in $X_a^\circ$ can one talk about $S_{bx}$. And since the two outcomes
of the quantum coin toss are mutually exclusive possibilities, one and only one of
which will occur in any given experimental run, we have again failed to establish
E4, and for basically the same reason pointed out in Sec. 24.1 when discussing
the family with two initial states that combines (23.22) and (23.28). Indeed, in the
latter family $S_{bz}$ and $S_{bx}$ are contextual properties which depend upon the
corresponding initial states, something we did not bother to point out in Sec. 24.1
because dependence (in the technical sense used in Ch. 14) on earlier events never
poses much of an intuitive problem. But does this contextuality mean that there is
some mysterious long-range influence in that a last minute choice to measure $S_{ax}$
rather than $S_{az}$ would somehow determine whether particle b has a definite value of
$S_x$ rather than $S_z$? No, for dependence or contextuality in the technical sense used
in Ch. 14 denotes a logical relationship brought about by choosing a framework in
a particular way, and does not indicate any sort of physical causality. Thus there is
no contradiction with the arguments presented in Sec. 24.1 in support of E3.

The reader with the patience to follow the analysis in this and the previous sec- 
tion may with some justification complain that the outcome was already certain at 
the outset: if E4 really does contradict the basic principles of quantum theory, as 
asserted by E5, then it is evident that it can never be obtained by an analysis based 
upon those principles. True enough, but there are various reasons why working 
out the details is still worthwhile. First, there is no way to establish with absolute 
certainty the consistency of the basic principles of a physical theory, as it is always 
something more than a piece of abstract mathematics or logic; one has to apply 
these principles to various examples and see what they predict. Second, it is of 
some interest to find out where and why the seemingly plausible chain of
arguments from E1 to E5 comes apart, for this tells us something about the difference
between quantum and classical physics. The preceding analysis shows that it is
basically violations of the single-framework rule which cause the trouble, and in this
respect the EPR paradox has quite a bit in common with the paradoxes discussed
in previous chapters. But the nonclassical behavior of contextual events can also 
play a role, depending on how one analyzes the paradox. 

Third, the analysis supports the correctness of the basic locality assumption of 
EPR as expressed in E3, an assertion which is confirmed by the analysis in Ch. 23. 
Given that the EPR paradox has sometimes been cited to support the claim that 
there are mysterious nonlocal influences in the quantum world, it is worth em- 
phasizing that the analysis given here does not show any evidence of such influ- 
ences. On the other hand, certain modifications of quantum mechanics in which 
the Hilbert space is supplemented by “hidden variables” of a particular sort will 
necessarily involve peculiar nonlocal influences if they are to reproduce the spin 
correlations (23.9) of standard quantum theory, and these are the subject of the 
remaining sections in this chapter. 


24.3 EPR and hidden variables 

A hidden variable theory is an alternative approach to quantum mechanics in which 
the Hilbert space of the standard theory is either replaced by or supplemented with a 
set of “hidden” (the name is not particularly apt) variables which behave like those 
one is accustomed to in classical mechanics. One of the best-known examples was 
proposed in 1952 by Bohm, using an approach similar to one employed earlier by 
de Broglie, in which at any instant of time all particles have precise positions, and 
these positions constitute the new (hidden) variables. 

The simplest hidden variable model of a spin-half particle is one in which the 
different components of its spin angular momentum simultaneously possess well- 
defined values, something which is not true if one uses a quantum Hilbert space, for 
reasons discussed in Sec. 4.6. A measurement of some component of spin using a 
Stern-Gerlach apparatus will then reveal the value that the corresponding (hidden)
variable had just before the measurement took place. More complicated models 
are possible, but the general idea is that measurement outcomes are determined by 
variables that behave classically in the sense that they simultaneously possess defi- 
nite values. John Bell pointed out in 1964 that hidden variable models of this kind 
cannot reproduce the correlation function C(a, b), (23.9) or (23.37), for spin-half 
particles in an initial singlet state, if one makes the reasonable assumption that no 
mysterious long-range influences link the particles and the measuring apparatuses. 
This result led to a number of experimental measurements of the spin correlation 
function. Most of the experiments have used the polarizations of correlated pho- 
tons rather than spin-half particles, but the principles are the same, and the results 
are in good agreement with the predictions of quantum mechanics. Note that one 
can think of this correlation function as referring to particle spins in the absence of
any measurement when one uses the framework (23.5), or as the correlation
function between outcomes of measurements of the spins of both particles, (23.37). In
line with most discussions of Bell’s result, we shall think of C(a, b) as referring to
measurement outcomes. 

Before exhibiting one version of Bell’s argument in Sec. 24.4, it is useful to look
at a specific setup discussed by Mermin. Imagine two apparatuses, one to measure
the spin of particle a and the other the spin of particle b, each of which can
measure the component of spin angular momentum in one of three directions in space,
u, v, and x, lying in the x, y plane, with an angle of 120° between every pair of
directions, Fig. 24.1. The component of spin which will be measured is determined
by a switch setting on the apparatus, and these settings will also be denoted by u,
v, and x. Let $\alpha(w) = \pm 1$ denote the two possible outcomes of the measurement
when the switch setting of the a apparatus is w: $+1$ if the spin is found to be in the
$+w$ direction, $S_{aw} = +1/2$, and $-1$ if it is in the opposite direction, $S_{aw} = -1/2$.
Let $\beta(w) = \pm 1$ be the possible outcomes of the b apparatus measurement when
its switch setting is w. In any given experiment these results will be random, but if
they are averaged over a large number of runs, the averages of $\alpha(w)$ and of $\beta(w)$
will be zero for any choice of w, whereas the correlation function (23.37) will be
given by:

$$
C(w_a, w_b) = \langle \alpha(w_a)\,\beta(w_b) \rangle =
\begin{cases}
-1 & \text{if } w_a = w_b, \\
+1/2 & \text{if } w_a \neq w_b,
\end{cases}
\tag{24.6}
$$

since if the switch settings $w_a$ and $w_b$ for the a and b apparatuses are unequal,
the angle between the two directions is 120°, and the inner product of the two
corresponding unit vectors is $-1/2$.
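
The values in (24.6) can be checked numerically. The following is a minimal sketch, assuming that the correlation function is the singlet-state expectation value of the product of the two Pauli spin components (so that the outcomes are $\pm 1$), and that the three directions sit at angles 0°, 120°, and 240° in the x, y plane. The helper names `sigma`, `kron`, `expectation`, and `correlation` are illustrative choices, not notation from the text.

```python
import cmath
import math

def sigma(phi):
    """Pauli operator for the spin component along a direction in the x, y
    plane at angle phi from the x axis: cos(phi) sigma_x + sin(phi) sigma_y."""
    return [[0, cmath.exp(-1j * phi)],
            [cmath.exp(1j * phi), 0]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def expectation(op, psi):
    """Real part of <psi| op |psi> for a 4-component state vector."""
    op_psi = [sum(op[r][c] * psi[c] for c in range(4)) for r in range(4)]
    return sum(psi[r].conjugate() * op_psi[r] for r in range(4)).real

# Singlet state (|z+ z-> - |z- z+>)/sqrt(2) in the basis ++, +-, -+, --.
s = 1 / math.sqrt(2)
singlet = [0j, s + 0j, -s + 0j, 0j]

# Three directions 120 degrees apart; the overall orientation is an assumption.
angles = {"u": 0.0, "v": 2 * math.pi / 3, "x": 4 * math.pi / 3}

def correlation(wa, wb):
    """Correlation of the a outcome for setting wa with the b outcome for wb."""
    return expectation(kron(sigma(angles[wa]), sigma(angles[wb])), singlet)

for wa in angles:
    print(wa, [round(correlation(wa, wb), 6) for wb in angles])
```

The diagonal entries of the printed table correspond to equal switch settings and the off-diagonal entries to unequal ones, which is the two-case structure of (24.6).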



Fig. 24.1. Directions u, v, x in the x, y plane. 

Let us try to construct a hidden variable model which can reproduce the cor- 
relation function (24.6). Suppose that particle a when it leaves the source which 
prepares the two particles in a singlet state contains an “instruction set” which will 
determine the outcomes of the measurements in each of the three directions u, v, 
and x. For example, if the particle carries the instruction set $(+1, +1, -1)$, a
measurement of $S_{au}$ will yield the result $+1/2$, a measurement of $S_{av}$ will also yield
$+1/2$, and one of $S_{ax}$ will yield $-1/2$. Of course, only one of these measurements
will actually be carried out, the one determined by the switch setting on the
apparatus when particle a arrives. Whichever measurement it may be, the result is
determined ahead of time by the particle’s instruction set. One can think of the
instruction set as a list of the components of spin angular momentum in each of the
three directions, in units of $\hbar/2$. This is what is called a “deterministic hidden
variable” model because the instruction set, which constitutes the hidden variables in
this model, determines the later measurement outcome without any extra element 
of randomness. It is possible to construct stochastic hidden variable models, but 
they turn out to be no more successful than deterministic models in reproducing 
the correlations predicted by standard quantum theory. 

There are eight possible instruction sets for particle a and eight for particle b, 
thus a total of sixty-four possibilities for the two particles together. However, the 
perfect anticorrelation when $w_a = w_b$ in (24.6) can only be achieved if the
instruction set for b is the complementary set to that of a, obtained by changing the sign
of each instruction. If the a set is $(+1, +1, -1)$, the b set must be $(-1, -1, +1)$.
For were the b set something else, say $(+1, -1, +1)$, then there would be
identical switch settings, in this case $w_a = w_b = u$, leading to $\alpha(u) = \beta(u)$, which
is not possible. Similarly, perfect anticorrelations for equal switch settings mean
that the instruction sets, once prepared at the source which produces the singlet 
state, cannot change in a random manner as a particle moves from the source to the 
measuring apparatus. 

We will assume that the source produces singlet pairs with one of the eight
instruction sets for a, and the complementary set for b, chosen randomly with a
certain probability. Let $P_a(++-)$ denote the probability that the instruction set
for a is $(+1, +1, -1)$. The correlation functions can be expressed in terms of these
probabilities; for example,

$$
\begin{aligned}
C(u, v) = C(v, u) = {} & -P_a(+++) - P_a(++-) + P_a(+-+) + P_a(-++) \\
& + P_a(+--) + P_a(-+-) - P_a(--+) - P_a(---).
\end{aligned}
\tag{24.7}
$$

Consider the following sum of correlation functions calculated in this way:

$$
\begin{aligned}
C(u, v) + C(u, x) + C(x, v) = {} & -3P_a(+++) - 3P_a(---) + P_a(++-) + P_a(+-+) \\
& + P_a(-++) + P_a(+--) + P_a(-+-) + P_a(--+).
\end{aligned}
\tag{24.8}
$$

Since the probabilities of the different instruction sets add to 1, this quantity has 
a value lying between —3 and +1. However, if we use the quantum mechanical 
values (24.6) for the correlation functions, the left side of (24.8) is 3/2, substantially
greater than 1. Thus our hidden variable model cannot reproduce the correlation
functions predicted by quantum theory. As we shall see in the next section, this 
failure is not an accident; it is something which one must expect in hidden variable 
models of this sort. 
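
The bound on (24.8) can be confirmed by listing all eight instruction sets. The sketch below assumes, as in the text, that particle b always carries the complement of the a set; the function name `pair_sum` and the variable names are illustrative only.

```python
from itertools import product

def pair_sum(a_set):
    """C(u,v) + C(u,x) + C(x,v) when particle a carries a_set with certainty
    and particle b carries the complementary instruction set."""
    au, av, ax = a_set
    bu, bv, bx = -au, -av, -ax          # complementary instructions for b
    return au * bv + au * bx + ax * bv  # alpha(u)beta(v) + alpha(u)beta(x) + alpha(x)beta(v)

# Value of the sum for each of the eight possible instruction sets of particle a.
values = {a_set: pair_sum(a_set) for a_set in product((+1, -1), repeat=3)}
for a_set, value in values.items():
    print(a_set, value)

lo, hi = min(values.values()), max(values.values())
quantum = 3 * 0.5   # three unequal-setting correlations of +1/2 each, from (24.6)
print("hidden variable range:", lo, "to", hi, "; quantum value:", quantum)
```

Any probability distribution over the instruction sets gives a weighted average of the listed values, so it cannot exceed the largest of them, whereas the quantum value does.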


24.4 Bell inequalities

The inequality (24.10) was derived in 1969 by Clauser, Horne, Shimony, and Holt.
As it is closely related to Bell’s original result in 1964, this CHSH inequality is
nowadays also referred to as a “Bell inequality”, and by studying it one can learn
the essential ideas behind such inequalities. We assume that when the a apparatus
measures a spin component in the direction $w_a$, the outcome is given by a function
$\alpha(w_a, \lambda) = \pm 1$ which depends on both $w_a$ and a hidden variable, or collection
of hidden variables, denoted by $\lambda$. Similarly, the outcome of the b measurement
for a spin component in the direction $w_b$ is given by a function $\beta(w_b, \lambda) = \pm 1$.
In the example in Sec. 24.3, $w_a$ and $w_b$ can take on any of the three values u, v,
or x, and $\lambda$ should be thought of as the pair of instruction sets for both particles
a and b. Hence $\lambda$ could take on sixty-four different values, though we argued in
Sec. 24.3 that the probabilities of all but eight of these must be 0. For the purpose
of deriving the inequality, one need not think of $w_a$ as a direction in space; it can
simply be some sort of switch setting on the a apparatus, which, together with the
value of the hidden variable $\lambda$ associated with the particle, determines the outcome
of the measurement through the function $\alpha(w_a, \lambda)$. The same remark applies to the
b apparatus and the function $\beta(w_b, \lambda)$. Also, the derivation makes no use of the
fact that the two spin-half particles are initially in a spin singlet state.

The source which produces the correlated particles produces different possible
values of $\lambda$ with a probability $p(\lambda)$, so the correlation function is given by

$$C(w_a, w_b) = \sum_\lambda p(\lambda)\, \alpha(w_a, \lambda)\, \beta(w_b, \lambda). \tag{24.9}$$

(If $\lambda$ is a continuous variable, $\sum_\lambda p(\lambda)$ should be replaced by $\int p(\lambda)\, d\lambda$.) Let $a$,
$a'$ be any two possible values for $w_a$, and $b$ and $b'$ any two possible values for
$w_b$. Then as long as $\alpha(w_a, \lambda)$ and $\beta(w_b, \lambda)$ are functions which take only the two
values $+1$ or $-1$, the correlations defined by (24.9) satisfy the inequality

$$|C(a, b) + C(a, b') + C(a', b) - C(a', b')| \leq 2. \tag{24.10}$$

To see that this is so, consider the quantity

$$
\begin{aligned}
& \alpha(a, \lambda)\beta(b, \lambda) + \alpha(a, \lambda)\beta(b', \lambda) + \alpha(a', \lambda)\beta(b, \lambda) - \alpha(a', \lambda)\beta(b', \lambda) \\
& \quad = [\alpha(a, \lambda) + \alpha(a', \lambda)]\,\beta(b, \lambda) + [\alpha(a, \lambda) - \alpha(a', \lambda)]\,\beta(b', \lambda).
\end{aligned}
\tag{24.11}
$$

It can take on only two values, $+2$ and $-2$, because each of the four quantities
$\alpha(a, \lambda)$, $\alpha(a', \lambda)$, $\beta(b, \lambda)$ and $\beta(b', \lambda)$ is either $+1$ or $-1$. Thus either $\alpha(a, \lambda) =
\alpha(a', \lambda)$, so that the right side of (24.11) is $2\alpha(a, \lambda)\beta(b, \lambda)$, or else $\alpha(a, \lambda) =
-\alpha(a', \lambda)$, in which case it is $2\alpha(a, \lambda)\beta(b', \lambda)$. If one multiplies (24.11) by $p(\lambda)$
and sums over $\lambda$, the result of this weighted average is

$$C(a, b) + C(a, b') + C(a', b) - C(a', b'). \tag{24.12}$$

A weighted average of a quantity which takes on only two values must lie between 
them, so (24.12) lies somewhere between —2 and +2, which is what (24.10) as- 
serts. 
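
The two steps of this argument can be illustrated by brute force: first list the value of (24.11) for every assignment of $\pm 1$ to the four outcome functions, then form a weighted average. In the sketch below the probability distribution over hidden variable values is generated at random purely for illustration, and the names `chsh_term`, `weights`, and `average` are assumptions of the example.

```python
from itertools import product
import random

def chsh_term(alpha_a, alpha_a2, beta_b, beta_b2):
    """The quantity (24.11) for a single value of the hidden variable.
    alpha_a2 and beta_b2 stand for the outcomes at the primed settings."""
    return (alpha_a * beta_b + alpha_a * beta_b2
            + alpha_a2 * beta_b - alpha_a2 * beta_b2)

# All 16 deterministic assignments of +1 or -1 to the four outcomes.
terms = [chsh_term(*signs) for signs in product((+1, -1), repeat=4)]
print(sorted(set(terms)))

# An arbitrary probability distribution over the 16 assignments, playing the
# role of p(lambda); the seed is fixed only to make the run repeatable.
random.seed(0)
weights = [random.random() for _ in terms]
total = sum(weights)
average = sum(w * t for w, t in zip(weights, terms)) / total
print(average)
```

Because every individual term is one of the two extreme values, no choice of weights can push the average outside the interval they span.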

Consider the example in Sec. 24.3, and set $a = u$, $b = v$, $a' = b' = x$. If
one inserts the quantum values (24.6) for these correlation functions in (24.12), the
result is $3 \times 1/2 + 1 = 2.5$, which obviously violates the inequality (24.10). On the
other hand, the hidden variable model in Sec. 24.3 assigns to the sum $C(u, v) +
C(u, x) + C(x, v)$, see (24.8), a value between $-3$ and $+1$, and since $C(x, x) =
-1$, the inequality (24.10) will be satisfied.

If quantum theory is a correct description of the world, then since it predicts cor- 
relation functions which violate (24.10), one or more of the assumptions made in 
the derivation of this inequality must be wrong. The first and most basic of these as- 
sumptions is the existence of hidden variables with a mathematical structure which 
differs from the Hilbert space used in standard quantum mechanics. This assump- 
tion is plausible from the perspective of classical physics if measurements reveal 
pre-existing properties of the measured system. In quantum physics it is also the 
case that a measurement reveals a pre-existing property provided this property is 
part of the framework which is being used to construct the quantum description. If 
$S_{az}$ is measured for particle a, the outcome of a suitable (ideal) measurement will
be correlated with the value of this component of spin angular momentum before
the measurement in a framework which includes $|z_a^+\rangle$ and $|z_a^-\rangle$. However, there is
no framework which includes the eigenstates of both $S_{az}$ and $S_{aw}$ for a direction w
not equal to $z$ or $-z$.

Thus the point at which the derivation of (24.10) begins to deviate from quantum
principles is in the assumption that a function $\alpha(w_a, \lambda)$ exists for different
directions $w_a$. As long as only a single choice for $w_a$ is under consideration there is no
problem, for then the “hidden” variable $\lambda$ can simply be the value of $S_{aw}$ at some
earlier time. But when two (excluding the trivial case of $w_a$ and $-w_a$) or even
more possibilities are allowed, the assumption that $\alpha(w_a, \lambda)$ exists is in conflict
with basic quantum principles. Precisely the same comments apply to the function
$\beta(w_b, \lambda)$.

Of course, if postulating hidden variables is itself in error, there is no need to 
search for problems with the other assumptions having to do with the nature of 
these hidden variables. Nonetheless, let us see what can be said about them. A
second assumption entering the derivation of (24.10) is that the hidden variable
theory is local. Locality appears in the assumption that the outcome $\alpha(w_a, \lambda)$ of
the a measurement depends on the setting $w_a$ of this piece of apparatus, but not the
setting $w_b$ for the b apparatus, and that $\beta(w_b, \lambda)$ does not depend upon $w_a$. These
assumptions are plausible, especially if one supposes that the particles a and b and
the corresponding apparatuses are far apart at the time when the measurements take
place. For then the settings $w_a$ and $w_b$ could be chosen at the very last moment
before the measurements take place, and it is hard to see how either value could 
have any influence on the outcome of the measurement made by the other appara- 
tus. Indeed, for a sufficiently large separation, an influence of this sort would have 
to travel faster than the speed of light, in violation of relativity theory. 

The claim is sometimes made that quantum theory must be nonlocal simply be- 
cause its predictions violate (24.10). But this is not correct. First, what follows 
logically from the violation of this inequality is that hidden variable theories, if 
they are to agree with quantum theory, must be nonlocal or embody some other 
peculiarity. But hidden variable theories by definition employ a different mathe- 
matical structure from (or in addition to) the quantum Hilbert space, so this tells us 
nothing about standard quantum mechanics. Second, the detailed quantum anal- 
ysis of a spin singlet system in Ch. 23 shows no evidence of nonlocality; indeed, 
it demonstrates precisely the opposite: the spin of particle b is not influenced in 
any way by the measurements carried out on particle a. (To be sure, in Ch. 23 we 
did not discuss how a measurement on particle a might influence the outcome of 
a measurement on particle b, but the argument can be easily extended to include 
that case, and the conclusion is exactly the same.) Hidden variable theories, on the 
other hand, can indeed be nonlocal. The Bohm theory mentioned in Sec. 24.3 is 
known to be nonlocal in a rather thorough-going way, and this is one reason why it 
has been difficult to construct a relativistic version of it. 

A third assumption which was made in deriving the inequality (24.10) is that
the probability distribution $p(\lambda)$ for the hidden variable(s) $\lambda$ does not depend upon
either $w_a$ or $w_b$. This seems plausible if there is a significant interval between the
time when the two particles are prepared in some singlet state by a source which
sets the value of $\lambda$, and the time when the spin measurements occur. For $w_a$ and $w_b$
could be chosen just before the measurements take place, and this choice should
not affect the value of $\lambda$ determined earlier, unless the future can influence the
past. 

In summary, the basic lesson to be learned from the Bell inequalities is that it is 
difficult to construct a plausible hidden variable theory which will mimic the sorts 
of correlations predicted by quantum theory and confirmed by experiment. Such a 
theory must either exhibit peculiar nonlocalities which violate relativity theory, or 
else incorporate influences which travel backwards in time, in contrast to everyday
experience. This seems a rather high price to pay just to have a theory which is
more “classical” than ordinary quantum mechanics. 



25 

Hardy’s paradox 


25.1 Introduction 

Hardy’s paradox resembles the Bohm version of the Einstein-Podolsky-Rosen 
paradox, discussed in Chs. 23 and 24, in that it involves two correlated particles, 
each of which can be in one of two states. However, Hardy’s initial state is chosen 
in such a way that by following a plausible line of reasoning one arrives at a logical 
contradiction: something is shown to be true which one knows to be false. This 
makes this paradox in some respects more paradoxical than the EPR paradox as 
stated in Sec. 24.1. A paradox of a somewhat similar nature involving three
spin-half particles was discovered (or invented) by Greenberger, Horne, and Zeilinger a
few years earlier. The basic principles behind this GHZ paradox are very similar 
to those involved in Hardy’s paradox. We shall limit our analysis to Hardy’s para- 
dox, as it is a bit simpler, but the same techniques can be used to analyze the GHZ 
paradox. 

Hardy’s paradox can be discussed in the language of spin-half particles, but we 
will follow the original paper, though with some minor modifications, in thinking 
of it as involving two particles, each of which can move through one of two arms 
(the two arms are analogous to the two states of a spin-half particle) of an interfer- 
ometer, as indicated in Fig. 25.1. These are particles without spin, or for which the 
spin degree of freedom plays no role in the gedanken experiment. The source S 
at the center of the diagram produces two particles a and b moving to the left and 
right, respectively, in an initial state 

$$|\psi_0\rangle = \bigl(|c\bar c\rangle + |c\bar d\rangle + |d\bar c\rangle\bigr)/\sqrt{3}. \tag{25.1}$$

Here $|c\bar d\rangle$ stands for $|c\rangle \otimes |\bar d\rangle$, a state in which particle a is in the c arm of the
left interferometer, and particle b in the $\bar d$ arm of the interferometer on the right.
The other kets are defined in the same way. One can think of the two particles as
two photons, but other particles will do just as well. In Hardy’s original paper one
particle was an electron and the other a positron, and the absence of a $d\bar d$ term in
(25.1) was due to their meeting and annihilating each other.

Suppose that S produces the state (25.1) at the time $t_0$. The unitary time
development from $t_0$ to a time $t_1$, which is before either of the particles passes through the
beam splitter at the output of its interferometer, is trivial: each particle remains in
the same arm in which it starts out. We shall denote the states at $t_1$ using the same
symbol as at $t_0$: $|c\rangle$, $|d\rangle$, etc. One could change $c$ to $c'$, etc., but this is really not
necessary. In this simplified notation the time development operator for the time
interval from $t_0$ to $t_1$ is simply the identity $I$. During the time interval from $t_1$ to $t_2$,
each particle passes through the beam splitter at the exit of its interferometer, and
these beam splitters produce unitary transformations

$$
\begin{aligned}
B &: |c\rangle \mapsto \bigl(|e\rangle + |f\rangle\bigr)/\sqrt{2}, & |d\rangle &\mapsto \bigl(-|e\rangle + |f\rangle\bigr)/\sqrt{2}, \\
\bar B &: |\bar c\rangle \mapsto \bigl(|\bar e\rangle + |\bar f\rangle\bigr)/\sqrt{2}, & |\bar d\rangle &\mapsto \bigl(-|\bar e\rangle + |\bar f\rangle\bigr)/\sqrt{2},
\end{aligned}
\tag{25.2}
$$

where $|e\rangle$, etc., denote wave packets in the output channels, and the phases are
chosen to agree with those used for the toy model in Sec. 12.1. Combining the
transformations in (25.2) results in the unitary transformations

$$
\begin{aligned}
|c\bar c\rangle &\mapsto \bigl(+|e\bar e\rangle + |e\bar f\rangle + |f\bar e\rangle + |f\bar f\rangle\bigr)/2, \\
|c\bar d\rangle &\mapsto \bigl(-|e\bar e\rangle + |e\bar f\rangle - |f\bar e\rangle + |f\bar f\rangle\bigr)/2, \\
|d\bar c\rangle &\mapsto \bigl(-|e\bar e\rangle - |e\bar f\rangle + |f\bar e\rangle + |f\bar f\rangle\bigr)/2, \\
|d\bar d\rangle &\mapsto \bigl(+|e\bar e\rangle - |e\bar f\rangle - |f\bar e\rangle + |f\bar f\rangle\bigr)/2,
\end{aligned}
\tag{25.3}
$$

for the combined states of the two particles during the time interval from $t_0$ or $t_1$ to
$t_2$. Adding up the appropriate terms in (25.3), one finds that the initial state (25.1)
is transformed into

$$B\bar B : |\psi_0\rangle \mapsto \bigl(-|e\bar e\rangle + |e\bar f\rangle + |f\bar e\rangle + 3|f\bar f\rangle\bigr)/\sqrt{12} \tag{25.4}$$

by the beam splitters.
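
The bookkeeping behind (25.4) is easy to reproduce numerically. The sketch below is one possible way to do it, under the assumption that a two-particle state is stored as a dictionary from pairs of labels to real amplitudes, with the first label referring to particle a and the second to particle b (so the overbar is implied by position). The names `B`, `apply`, `psi0`, and `final` are illustrative.

```python
import math

R = 1 / math.sqrt(2)
# Single-particle beam splitter map (25.2). The same table serves for both
# particles, because the overbar on b's labels is implied by tuple position.
B = {"c": {"e": R, "f": R}, "d": {"e": -R, "f": R}}

def apply(map_a, map_b, state):
    """Apply map_a to particle a and map_b to particle b of a two-particle
    state stored as {(label_a, label_b): amplitude}."""
    out = {}
    for (la, lb), amp in state.items():
        for ma, ca in map_a[la].items():
            for mb, cb in map_b[lb].items():
                out[(ma, mb)] = out.get((ma, mb), 0.0) + amp * ca * cb
    return out

# Initial state (25.1): equal amplitudes for c c-bar, c d-bar, d c-bar.
A3 = 1 / math.sqrt(3)
psi0 = {("c", "c"): A3, ("c", "d"): A3, ("d", "c"): A3}

final = apply(B, B, psi0)
for key in sorted(final):
    # Scale by sqrt(12) to compare with the integer coefficients in (25.4).
    print(key, round(final[key] * math.sqrt(12), 6))
print("Pr(e, e-bar) =", round(final[("e", "e")] ** 2, 6))
```

The squared amplitude printed on the last line is the probability that both particles emerge in their e channels, which is needed later in the discussion of the paradox.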





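We will later need to know what happens if one or both of the beam splitters has
been taken out of the way. Let $O$ and $\bar O$ denote situations in which the left and the
right beam splitters, respectively, have been removed. Then (25.2) is to be replaced
with

$$
\begin{aligned}
O &: |c\rangle \mapsto |f\rangle, & |d\rangle &\mapsto |e\rangle, \\
\bar O &: |\bar c\rangle \mapsto |\bar f\rangle, & |\bar d\rangle &\mapsto |\bar e\rangle,
\end{aligned}
\tag{25.5}
$$

in agreement with what one would expect from Fig. 25.1. The time development
of $|\psi_0\rangle$ from $t_0$ to $t_2$ if one or both of the beam splitters is absent can be worked out
using (25.5) together with (25.2):

$$
\begin{aligned}
B\bar O &: |\psi_0\rangle \mapsto \bigl(|e\bar e\rangle + |f\bar e\rangle + 2|f\bar f\rangle\bigr)/\sqrt{6}, \\
O\bar B &: |\psi_0\rangle \mapsto \bigl(|e\bar e\rangle + |e\bar f\rangle + 2|f\bar f\rangle\bigr)/\sqrt{6}, \\
O\bar O &: |\psi_0\rangle \mapsto \bigl(|e\bar f\rangle + |f\bar e\rangle + |f\bar f\rangle\bigr)/\sqrt{3}.
\end{aligned}
\tag{25.6}
$$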
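
The three cases in (25.6) follow from the same kind of substitution, with the map (25.5) used for whichever beam splitter is absent. A minimal sketch, again assuming states are stored as dictionaries keyed by (label for a, label for b); the names `B`, `O`, `apply`, `cases`, and `results` are illustrative and not taken from the text.

```python
import math

R = 1 / math.sqrt(2)
B = {"c": {"e": R, "f": R}, "d": {"e": -R, "f": R}}  # beam splitter present, (25.2)
O = {"c": {"f": 1.0}, "d": {"e": 1.0}}               # beam splitter removed, (25.5)

def apply(map_a, map_b, state):
    """Apply map_a to particle a and map_b to particle b of a two-particle
    state stored as {(label_a, label_b): amplitude}."""
    out = {}
    for (la, lb), amp in state.items():
        for ma, ca in map_a[la].items():
            for mb, cb in map_b[lb].items():
                out[(ma, mb)] = out.get((ma, mb), 0.0) + amp * ca * cb
    return out

# Initial state (25.1); the second label in each key refers to particle b.
A3 = 1 / math.sqrt(3)
psi0 = {("c", "c"): A3, ("c", "d"): A3, ("d", "c"): A3}

# First map acts on particle a (left), second on particle b (right).
cases = {"B, O-bar": (B, O), "O, B-bar": (O, B), "O, O-bar": (O, O)}
results = {name: apply(ma, mb, psi0) for name, (ma, mb) in cases.items()}

for name, state in results.items():
    nonzero = {k: round(v, 6) for k, v in state.items() if abs(v) > 1e-12}
    print(name, nonzero)
```

Each printed dictionary lists the nonvanishing amplitudes, which can be compared with the corresponding line of (25.6) after multiplying by the normalization factor shown there.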

When they emerge from the beam splitters, the particles are detected, see
Fig. 25.1. In order to have a compact notation, we shall use $|M\rangle$ for the initial
state of the two detectors for particle a, and $|\bar M\rangle$ that of the detectors for particle
b, and assume that the process of detection corresponds to the following unitary
transformations for the time interval from $t_2$ to $t_3$:

$$
\begin{aligned}
|e\rangle|M\rangle &\mapsto |E\rangle, & |f\rangle|M\rangle &\mapsto |F\rangle, \\
|\bar e\rangle|\bar M\rangle &\mapsto |\bar E\rangle, & |\bar f\rangle|\bar M\rangle &\mapsto |\bar F\rangle.
\end{aligned}
\tag{25.7}
$$

Thus $|E\rangle$ means that particle a was detected by the detector located on the e
channel. We are now ready to consider the paradox, which can be formulated in two
different ways. Both of these are found in Hardy’s original paper, though in the
opposite order.


25.2 The first paradox

For this paradox we suppose that both beam splitters are in place. Consider the
consistent family of histories at the times $t_0$, $t_1$, and $t_2$ whose support consists of
the four histories

$$\mathcal{F}_1 : \psi_0 \odot \{\bar c, \bar d\} \odot \{e, f\}, \tag{25.8}$$

with the same initial state $\psi_0$, and one of the two possibilities $\bar c$ or $\bar d$ at $t_1$,
followed by $e$ or $f$ at $t_2$. Here the symbols stand for projectors associated with the
corresponding kets: $c = |c\rangle\langle c|$, etc. That this family is consistent can be seen by
noting that the unitary dynamics for particle a is independent of that for particle b
at all times after $t_0$: the time development operator factors. Thus the Heisenberg
operators (see Sec. 11.4) for $\bar c$ and $\bar d$, which refer to the b particle, commute with
those for $e$ and $f$, which refer to the a particle, so that for purposes of checking
consistency, (25.8) is the same as a history involving only two times: $t_0$ and one
later time. Hence one can apply the rule that a family of histories involving only
two times is automatically consistent, Sec. 11.3. Of course, one can reach the same
conclusion by explicitly calculating the chain kets (Sec. 11.6) and showing that they
are orthogonal to one another.

The history $\psi_0 \odot \bar c \odot e$ has zero weight. To see this, construct the chain ket
starting with

$$\bar c\,|\psi_0\rangle = \bigl(|c\bar c\rangle + |d\bar c\rangle\bigr)/\sqrt{3} = \bigl(|c\rangle + |d\rangle\bigr) \otimes |\bar c\rangle/\sqrt{3}. \tag{25.9}$$

When $T(t_2, t_1)$ is applied to this, the result (see (25.2)) will be $|f\rangle$ times a ket
for the b particle, and applying the projector $e$ to it yields zero. As a consequence,
since $\psi_0 \odot \bar d \odot e$ has finite weight, one has

$$\Pr(\bar d, t_1 \mid e, t_2) = 1, \tag{25.10}$$

where the times $t_1$ and $t_2$ associated with the events $\bar d$ and $e$ are indicated explicitly,
rather than by subscripts as in earlier chapters. Thus if particle a emerges in $e$ at
time $t_2$, one can be sure that particle b was in the $\bar d$ arm at time $t_1$.

A similar result is obtained if instead of (25.8) one uses the family

$$\mathcal{F}_1' : \Psi_0 \odot \{\bar c, \bar d\} \odot \{E, F\}, \tag{25.11}$$

with events at times $t_0$, $t_1$, and $t_3$, where

$$|\Psi_0\rangle = |\psi_0\rangle \otimes |M\bar M\rangle \tag{25.12}$$

includes the initial states of the measuring devices. The fact that $\Psi_0 \odot \bar c \odot E$ has
zero weight implies that

$$\Pr(\bar d, t_1 \mid E, t_3) = 1. \tag{25.13}$$

Of course, (25.13) is what one would expect, given (25.10), and vice versa: the
measuring device shows that particle a emerged in the $e$ channel if and only if
this was actually the case. In the discussion which follows we will, because it is
somewhat simpler, use families of the type which do not include any measuring
devices. But the same sort of argument will work if instead of $e$, $\bar e$, etc. one uses
measurement outcomes $E$, $\bar E$, etc.

By symmetry it is clear that the family

$$\mathcal{F}_2:\ \psi_0 \odot \{c, d\} \odot \{\bar e, \bar f\}, \tag{25.14}$$

obtained by interchanging the role of particles $a$ and $b$ in (25.8), is consistent. Since
the history $\psi_0 \odot c \odot \bar e$ has zero weight, it follows that

$$\Pr(d, t_1 \mid \bar e, t_2) = 1, \tag{25.15}$$

or, if measurements are included,

$$\Pr(d, t_1 \mid \bar E, t_2) = 1. \tag{25.16}$$

That is, if particle $b$ emerges in channel $\bar e$ (the measurement result is $\bar E$), then
particle $a$ was earlier in the $d$ and not the $c$ arm of its interferometer.

To complete the paradox, we need two additional families. Using

$$\mathcal{F}_3:\ \psi_0 \odot I \odot \{e\bar e, e\bar f, f\bar e, f\bar f\}, \tag{25.17}$$

one can show, see (25.4), that

$$\Pr(e\bar e, t_2) = 1/12. \tag{25.18}$$

Finally, the family

$$\mathcal{F}_4:\ \psi_0 \odot \{c\bar c, c\bar d, d\bar c, d\bar d\} \odot I \tag{25.19}$$

yields the result

$$\Pr(d\bar d, t_1) = 0, \tag{25.20}$$

because $|d\bar d\rangle$ occurs with zero amplitude in $|\psi_0\rangle$, (25.1).
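
The two numbers (25.18) and (25.20) can be reproduced with a few lines of arithmetic. This is a sketch under the same assumptions as elsewhere in this section: the initial state is taken to be $(|c\bar c\rangle + |c\bar d\rangle + |d\bar c\rangle)/\sqrt{3}$ and each beam splitter is taken to act as $|c\rangle \mapsto (|e\rangle + |f\rangle)/\sqrt{2}$, $|d\rangle \mapsto (-|e\rangle + |f\rangle)/\sqrt{2}$. Variable names are illustrative.

```python
from math import sqrt

S = 1 / sqrt(2)
# Assumed beam-splitter map for each particle.
BS = {"c": {"e": S, "f": S}, "d": {"e": -S, "f": S}}

# Assumed initial state; keys are (arm of a, arm of b).
A = 1 / sqrt(3)
psi0 = {("c", "c"): A, ("c", "d"): A, ("d", "c"): A}

# Unitary evolution from t1 to t2: both particles pass their beam splitters.
final = {}
for (a, b), amp in psi0.items():
    for a2, u in BS[a].items():
        for b2, v in BS[b].items():
            final[(a2, b2)] = final.get((a2, b2), 0.0) + amp * u * v

# Born probabilities for the four output-channel pairs at t2.
probs_t2 = {k: abs(v) ** 2 for k, v in final.items()}
pr_ee = probs_t2[("e", "e")]                      # compare with (25.18)
pr_dd = abs(psi0.get(("d", "d"), 0.0)) ** 2       # compare with (25.20)
print(probs_t2, pr_dd)
```

The three contributions to the $e\bar e$ amplitude are $+1/2$, $-1/2$, and $-1/2$ in units of $1/\sqrt{3}$, which is where the value $1/12$ comes from.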

Hardy’s paradox can be stated in the following way. Whenever $a$ emerges in the
$e$ channel we can be sure, (25.10), that $b$ was earlier in the $\bar d$ arm, and whenever $b$
emerges in the $\bar e$ channel we can be sure, (25.15), that $a$ was earlier in the $d$ arm.
The probability that $a$ will emerge in $e$ at the same time that $b$ emerges in $\bar e$ is 1/12,
(25.18), and when this happens it must be true that $a$ was earlier in $d$ and $b$ was
earlier in $\bar d$. But given the initial state $|\psi_0\rangle$, it is impossible for $a$ to be in $d$ at the
same time that $b$ is in $\bar d$, (25.20), so we have reached a contradiction.

Here is a formal argument using probability theory. First, (25.10) implies that

$$\Pr(\bar d, t_1 \mid e\bar e, t_2) = 1, \tag{25.21}$$

because if a conditional probability is equal to 1, it will also be equal to 1 if the
condition is made more restrictive, assuming the new condition has positive
probability. In the case at hand the condition $e$ is replaced with $e\bar e$, and the latter has a
probability of 1/12, (25.18). In the same way,

$$\Pr(d, t_1 \mid e\bar e, t_2) = 1 \tag{25.22}$$

is a consequence of (25.15). Combining (25.21) and (25.22) leads to

$$\Pr(d\bar d, t_1 \mid e\bar e, t_2) = 1, \tag{25.23}$$

and therefore, in light of (25.18),

$$\Pr(d\bar d, t_1) \geq 1/12. \tag{25.24}$$

In Hardy’s original paper this version of the paradox was constructed in a somewhat
different way. Rather than using conditional probabilities to infer properties
at earlier times, Hardy reasoned as follows, employing the version of the gedanken
experiment in which there is a final measurement. Suppose that both interferometers
are extremely large, so that the difference $t_2 - t_1$ is small compared to the time
required for light to travel from the source to one of the beam splitters, or from one
beam splitter to the other. (The choice of $t_1$ in our analysis is somewhat arbitrary,
but there is nothing wrong with choosing it to be just before the particles arrive at
their respective beam splitters.) In this case there is a moving coordinate system
or Lorentz frame in which relativistic effects mean that the detection of particle $a$
in the $e$ channel occurs (in this Lorentz frame) at a time when the $b$ particle is still
inside its interferometer. In this case, the inference from $E$ to $\bar d$ can be made using
wave function collapse, Sec. 18.2. By using a different Lorentz frame in which
the $b$ particle is the first to pass through its beam splitter, one can carry out the
corresponding inference from $\bar E$ to $d$. Next, Hardy made the assumption that
inferences of this sort which are valid in one Lorentz frame are valid in another Lorentz
frame, and this justifies the analogs of (25.13) and (25.16). With these results in
hand, the rest of the paradox is constructed in the manner indicated earlier, with a
few obvious changes, such as replacing $e\bar e$ in (25.18) with $E\bar E$.


25.3 Analysis of the first paradox 

In order to arrive at the contradiction between (25.24) and (25.20), it is necessary
to combine probabilities obtained using four different frameworks, $\mathcal{F}_1$, $\mathcal{F}_2$, $\mathcal{F}_3$,
and $\mathcal{F}_4$ (or their counterparts with the measuring apparatus included). While there
is no difficulty doing so in classical physics, in the quantum case one must check that
the corresponding frameworks are compatible, that is, there is a single consistent family
which contains all of the histories in $\mathcal{F}_1$-$\mathcal{F}_4$. However, it turns out that no two of
these frameworks are mutually compatible.

One way to see this is to note that the family

$$\mathcal{J}_1:\ \psi_0 \odot \{c, d\} \odot \{e, f\} \tag{25.25}$$

is inconsistent, as one can show by working out the chain kets and showing that
they are not orthogonal. This inconsistency should come as no surprise in view of
the discussion of interference in Ch. 13, since the histories in $\mathcal{J}_1$ contain projectors
indicating both which arm of the interferometer particle $a$ is in at time $t_1$ and the
channel in which it emerges at $t_2$. To be sure, the initial condition $|\psi_0\rangle$ is more
complicated than its counterpart in Ch. 13, but it would have to be of a fairly
special form in order not to give rise to inconsistencies. (It can be shown that each
of the four histories in (25.25) is intrinsically inconsistent in the sense that it can
never occur in a consistent family, Sec. 11.8.) Similarly, the family

$$\mathcal{J}_2:\ \psi_0 \odot \{\bar c, \bar d\} \odot \{\bar e, \bar f\} \tag{25.26}$$

is inconsistent.
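
The failure of consistency for (25.25) can be made concrete by computing the chain kets and their overlaps. The sketch below is illustrative only; it assumes the initial state $(|c\bar c\rangle + |c\bar d\rangle + |d\bar c\rangle)/\sqrt{3}$ and the beam-splitter map $|c\rangle \mapsto (|e\rangle + |f\rangle)/\sqrt{2}$, $|d\rangle \mapsto (-|e\rangle + |f\rangle)/\sqrt{2}$, and the function names are hypothetical.

```python
from math import sqrt

S = 1 / sqrt(2)
BS = {"c": {"e": S, "f": S}, "d": {"e": -S, "f": S}}  # assumed beam-splitter map
A = 1 / sqrt(3)
psi0 = {("c", "c"): A, ("c", "d"): A, ("d", "c"): A}  # assumed initial state


def evolve(ket):
    """Both particles pass through their beam splitters (t1 -> t2)."""
    out = {}
    for (a, b), amp in ket.items():
        for a2, u in BS[a].items():
            for b2, v in BS[b].items():
                out[(a2, b2)] = out.get((a2, b2), 0.0) + amp * u * v
    return out


def project(ket, particle, label):
    """Keep components in which particle 0 (a) or 1 (b) carries `label`."""
    return {k: v for k, v in ket.items() if k[particle] == label}


def inner(x, y):
    """Inner product <x|y> of two kets stored as dicts."""
    return sum(complex(x[k]).conjugate() * y[k] for k in x if k in y)


def chain_ket(a_arm, a_out):
    """Chain ket for psi0 (.) {a in a_arm at t1} (.) {a in a_out at t2}."""
    return project(evolve(project(psi0, 0, a_arm)), 0, a_out)


# Consistency would require these overlaps to vanish.
overlap_e = inner(chain_ket("c", "e"), chain_ket("d", "e"))
overlap_f = inner(chain_ket("c", "f"), chain_ket("d", "f"))
print(overlap_e, overlap_f)
```

Two histories ending in the same projector ($e$, or $f$) but passing through different arms have nonzero overlap here, which is the signature of interference between the arms.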

A comparison of $\mathcal{F}_1$ and $\mathcal{F}_2$, (25.8) and (25.14), shows that a common refinement
will necessarily include all of the histories in $\mathcal{J}_1$, since $c$ and $d$ occur in $\mathcal{F}_2$ at $t_1$,
and $e$ and $f$ in $\mathcal{F}_1$ at $t_2$. Therefore no common refinement can be a consistent
family, and $\mathcal{F}_1$ and $\mathcal{F}_2$ are incompatible. In the same way, with the help of $\mathcal{J}_1$
and $\mathcal{J}_2$ one can show that both $\mathcal{F}_1$ and $\mathcal{F}_2$ are incompatible with $\mathcal{F}_3$ and $\mathcal{F}_4$, and
that $\mathcal{F}_3$ is incompatible with $\mathcal{F}_4$. As a consequence of these incompatibilities, the
derivation of (25.21) from (25.10) is invalid, as is the corresponding derivation of
(25.22) from (25.15).

Although $\mathcal{F}_3$ and $\mathcal{F}_4$ are incompatible, there is a consistent family

$$\mathcal{F}_5:\ \psi_0 \odot \{d\bar d, I - d\bar d\} \odot \{e\bar e, e\bar f, f\bar e, f\bar f\} \tag{25.27}$$

from which one can deduce both (25.18) and (25.20). Consequently, the argument
which results in a paradox can be constructed by combining results from only three
incompatible families, $\mathcal{F}_1$, $\mathcal{F}_2$, and $\mathcal{F}_5$, rather than four. But three is still two too
many.

It is worth pointing out that the defect we have uncovered in the argument in 
Sec. 25.2, the violation of consistency conditions, has nothing to do with any sort 
of mysterious long-range influence by which particle b or a measurement carried 
out on particle b somehow influences particle a, even when they are far apart. 
Instead, the basic incompatibility is to be found in the fact that $\mathcal{J}_1$, a family which
involves only properties of particle $a$ after the initial time $t_0$, is inconsistent. Thus
the paradox arises from ignoring the quantum principles which govern what one 
can consistently say about the behavior of a single particle. 

A similar comment applies to Hardy’s original version of the paradox, for which
he employed different Lorentz frames. Although relativistic quantum theory is
outside the scope of this book, it is worth remarking that there is nothing wrong with
Hardy’s conclusion that the measurement outcome $E$ for particle $a$ implies that
particle $b$ was in the $\bar d$ arm of the interferometer before reaching the beam splitter
$\bar B$, even if some of the assumptions used in his argument, such as wave function
collapse and the Lorentz invariance of quantum theory, might be open to dispute.
For his conclusion is the same as (25.13), a result obtained by straightforward
application of quantum principles, without appealing to wave function collapse or
Lorentz invariance. Thus the paradox does not, in and of itself, provide any
indication that quantum theory is incompatible with special relativity, or that Lorentz
invariance fails to hold in the quantum domain.

25.4 The second paradox 

In this formulation we assume that both beam splitters are in place, and then make a 
counterfactual comparison with situations in which one or both of them are absent 
in order to produce a paradox. In order to model the counterfactuals, we suppose 
that two quantum coins are connected to servomechanisms in the manner indicated 
in Sec. 19.4, one for each beam splitter. Depending on the outcome of the coin 
toss, each servomechanism either leaves the beam splitter in place or removes it at 
the very last instant before the particle arrives. 

Consider a family of histories with support

$$\Phi_0 \odot I \odot \{B\bar B, B\bar O, O\bar B, O\bar O\} \odot \{E\bar E, E\bar F, F\bar E, F\bar F\} \tag{25.28}$$

at times $t_0 < t_1 < t_2 < t_3$, where $t_1$ is a time before the quantum coin is tossed,
$t_2$ a time after the toss and after the servomechanisms have done their work, but
before the particles reach the beam splitters (if still present), and $t_3$ a time after the
detection of each particle in one of the output channels. Note that the definition of
$t_2$ differs from that used in Sec. 25.3. The initial state $|\Phi_0\rangle$ includes the quantum
coins, servomechanisms, and beam splitters, along with $|\psi_0\rangle$, (25.1), for particles
$a$ and $b$.

Various probabilities can be computed with the help of the unitary transforma- 
tions given in (25.4), (25.6), and (25.7). For our purposes we need only the follow- 
ing results: 


$$\Pr(E\bar E, t_3 \mid B\bar B, t_2) = 1/12, \tag{25.29}$$

$$\Pr(E\bar F, t_3 \mid B\bar O, t_2) = 0, \tag{25.30}$$

$$\Pr(F\bar E, t_3 \mid O\bar B, t_2) = 0, \tag{25.31}$$

$$\Pr(E\bar E, t_3 \mid O\bar O, t_2) = 0. \tag{25.32}$$
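
The four conditional probabilities (25.29)-(25.32) can be reproduced with a short calculation. The sketch below is illustrative and rests on several assumptions not spelled out in this section: the initial particle state is taken to be $(|c\bar c\rangle + |c\bar d\rangle + |d\bar c\rangle)/\sqrt{3}$, a beam splitter that is present acts as $|c\rangle \mapsto (|e\rangle + |f\rangle)/\sqrt{2}$, $|d\rangle \mapsto (-|e\rangle + |f\rangle)/\sqrt{2}$, an absent beam splitter lets $|c\rangle \mapsto |f\rangle$ and $|d\rangle \mapsto |e\rangle$, and each detector outcome simply records the output channel. The names `MAPS` and `outcome_probs` are hypothetical.

```python
from math import sqrt

S = 1 / sqrt(2)
MAPS = {
    # Beam splitter present (assumed map).
    "B": {"c": {"e": S, "f": S}, "d": {"e": -S, "f": S}},
    # Beam splitter absent (assumed map): c goes to f, d goes to e.
    "O": {"c": {"f": 1.0}, "d": {"e": 1.0}},
}
A = 1 / sqrt(3)
psi0 = {("c", "c"): A, ("c", "d"): A, ("d", "c"): A}  # assumed initial state


def outcome_probs(setting_a, setting_b):
    """Probabilities of the detector pairs, given the two beam-splitter settings."""
    out = {}
    for (a, b), amp in psi0.items():
        for a2, u in MAPS[setting_a][a].items():
            for b2, v in MAPS[setting_b][b].items():
                out[(a2, b2)] = out.get((a2, b2), 0.0) + amp * u * v
    return {k: abs(v) ** 2 for k, v in out.items()}


pr_EE_given_BB = outcome_probs("B", "B").get(("e", "e"), 0.0)  # (25.29)
pr_EF_given_BO = outcome_probs("B", "O").get(("e", "f"), 0.0)  # (25.30)
pr_FE_given_OB = outcome_probs("O", "B").get(("f", "e"), 0.0)  # (25.31)
pr_EE_given_OO = outcome_probs("O", "O").get(("e", "e"), 0.0)  # (25.32)
print(pr_EE_given_BB, pr_EF_given_BO, pr_FE_given_OB, pr_EE_given_OO)
```

Because the coin outcomes are treated as given, each call computes a probability conditioned on one pair of beam-splitter settings, which is all that (25.29)-(25.32) require.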


These probabilities can be used to construct a counterfactual paradox in the fol- 
lowing manner. 

H1. Consider a case in which $B\bar B$ occurs as a result of the quantum coin tosses,
and the outcome of the final measurement on particle $a$ is $E$.

H2. Suppose that instead of being present, the beam splitter $\bar B$ had been absent,
$B\bar O$. The removal of a distant beam splitter at the last moment could not
possibly have affected the outcome of the measurement on particle $a$, so $E$
would have occurred in case $B\bar O$, just as it did in case $B\bar B$.

H3. Since by (25.30) $E\bar F$ is impossible in this situation, $E\bar E$ would have
occurred in the case $B\bar O$.

H4. Given that $\bar E$ would have occurred in the case $B\bar O$, it would also have
occurred with both beam splitters absent, $O\bar O$, since, once again, the removal
of a distant beam splitter $B$ at the last instant could not possibly have
affected the outcome of a measurement on particle $b$.

H5. It follows from H1-H4 that if $E$ occurs in the case $B\bar B$, then $\bar E$ would
have occurred, in this particular experiment, if the quantum coin tosses had
resulted in both beam splitters being absent, $O\bar O$, rather than present.

H6. Upon interchanging the roles of particles $a$ and $b$ in H1-H4, we conclude
that if $\bar E$ occurs in the case $B\bar B$, then $E$ would have been the case had both
beam splitters been absent, $O\bar O$.

H7. Consider a situation in which both $E$ and $\bar E$ occur in the case $B\bar B$; note that
the probability for this is greater than 0, (25.29). Then in the counterfactual
situation in which $O\bar O$ was the case rather than $B\bar B$, we can conclude using
H5 that $\bar E$, and using H6 that $E$, would have occurred. That is, the outcome
of the measurements would have been $E\bar E$ had the quantum coin tosses
resulted in $O\bar O$.

H8. But according to (25.32), $E\bar E$ cannot occur in the case $O\bar O$, so we have
reached a contradiction.


25.5 Analysis of the second paradox 

A detailed analysis of H1-H4 is a bit complicated, since both H2 and H4 involve 
counterfactuals, and the conclusion, stated in H5, comes from chaining together 
two counterfactual arguments. In order not to become lost in intricate details of 
how one counterfactual may be combined with another, it is best to focus on the 
end result in H5, which can be restated in the following way: If in the actual world
the quantum coin tosses result in $B\bar B$ and the measurement outcome is $E$, then in a
counterfactual world in which the coin tosses had resulted in $O\bar O$, particle $b$ would
have triggered detector $\bar E$.

To support this argument using the scheme of counterfactual reasoning discussed 
in Sec. 19.4, we need to specify a single consistent family which contains the events 
we are interested in, which are the outcomes of the coin tosses and at least some 
of the outcomes of the final measurements, together with some event (or perhaps 
events) at a time earlier than when the quantum coins were tossed, which can serve 
as a suitable pivot. The framework might contain more than this, but it must contain 
at least this much. (Note that the pivot event or events can make reference to both
particles, and could be more complicated than simply the product of a projector for
$a$ times a projector for $b$.) From this point of view, the intermediate steps in the
argument (for example, H2, in which only one of the beam splitters is removed)
can be thought of as a method for finding the final framework and pivot through
a series of intermediate steps. That is, we may be able to find a framework and 
pivot which will justify H2, and then modify the framework and choose another 
pivot, if necessary, in order to incorporate H3 and H4, so as to arrive at the desired 
result in H5. 

We shall actually follow a somewhat different procedure: make a guess for a
framework which will support the result in H5, and then check that it works. An
intelligent guess is not difficult, for $E$ in case $B\bar B$ implies that the $b$ particle was
earlier in the $\bar d$ arm of its interferometer, (25.13), and when the beam splitter $\bar B$ is
out of the way, a particle in $\bar d$ emerges in the $\bar e$ channel, which will result in $\bar E$. This
suggests taking a look at the consistent family containing the following histories,
in which the alternatives $\bar c$ and $\bar d$ occur at $t_1$:

$$\Phi_0 \odot \begin{cases}
\bar c \odot \begin{cases} B\bar B \odot F, \\ O\bar O \odot \bar F, \end{cases} \\[4pt]
\bar d \odot \begin{cases} B\bar B \odot \{E, F\}, \\ O\bar O \odot \bar E. \end{cases}
\end{cases} \tag{25.33}$$


The $B\bar O$ and $O\bar B$ branches have been omitted from (25.33) in order to save space
and allow us to concentrate on the essential task of finding a counterfactual
argument which leads from $B\bar B$ to $O\bar O$. Including these other branches terminated
by a noncommittal $I$ at $t_3$ will turn (25.33) into the support of a consistent family
without having any effect on the following argument.

The consistency of (25.33) can be seen in the following way. The events $B\bar B$
and $O\bar O$ are macroscopically distinct, hence orthogonal, and since they remain
unchanged from $t_2$ to $t_3$, we only need to check that the chain operators for the two
histories involving $O\bar O$ are orthogonal to each other (as is obviously the case,
since the final $\bar E$ and $\bar F$ are orthogonal) and that the chain operators for the three
histories involving $B\bar B$ are mutually orthogonal. The only conceivable problem arises
because two of the $B\bar B$ histories terminate with the same projector $F$. However,
because at earlier times these histories involve orthogonal states $\bar c$ and $\bar d$ of particle
$b$, and $F$ has to do with particle $a$ (that is, a measurement on particle $a$), rather
than $b$, the chain operators are, indeed, orthogonal. The reader can check this by
working out the chain kets.
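
The check suggested here can be sketched numerically for the $B\bar B$ branch. This is an illustration under stated assumptions rather than the book's own calculation: the coins and servomechanisms are left out, the detector outcomes $E$ and $F$ are identified with the channels $e$ and $f$ of particle $a$, the initial state is taken to be $(|c\bar c\rangle + |c\bar d\rangle + |d\bar c\rangle)/\sqrt{3}$, and the beam splitters are taken to act as $|c\rangle \mapsto (|e\rangle + |f\rangle)/\sqrt{2}$, $|d\rangle \mapsto (-|e\rangle + |f\rangle)/\sqrt{2}$. Function names are hypothetical.

```python
from math import sqrt
from itertools import combinations

S = 1 / sqrt(2)
BS = {"c": {"e": S, "f": S}, "d": {"e": -S, "f": S}}  # assumed beam-splitter map
A = 1 / sqrt(3)
psi0 = {("c", "c"): A, ("c", "d"): A, ("d", "c"): A}  # assumed initial state


def evolve(ket):
    """Both beam splitters present: evolve both particles to the output channels."""
    out = {}
    for (a, b), amp in ket.items():
        for a2, u in BS[a].items():
            for b2, v in BS[b].items():
                out[(a2, b2)] = out.get((a2, b2), 0.0) + amp * u * v
    return out


def project(ket, particle, label):
    """Keep components in which particle 0 (a) or 1 (b) carries `label`."""
    return {k: v for k, v in ket.items() if k[particle] == label}


def inner(x, y):
    """Inner product <x|y> of two kets stored as dicts."""
    return sum(complex(x[k]).conjugate() * y[k] for k in x if k in y)


def chain_ket(b_arm, a_out):
    """Chain ket for {b in b_arm at t1} followed by {a detected in a_out}."""
    return project(evolve(project(psi0, 1, b_arm)), 0, a_out)


# The three B-Bbar histories of (25.33): cbar -> F, dbar -> E, dbar -> F.
histories = {"cbar,F": chain_ket("c", "f"),
             "dbar,E": chain_ket("d", "e"),
             "dbar,F": chain_ket("d", "f")}
overlaps = {(p, q): inner(histories[p], histories[q])
            for p, q in combinations(histories, 2)}
# The omitted history cbar -> E should have zero weight.
w_cbar_E = sum(abs(v) ** 2 for v in chain_ket("c", "e").values())
print(overlaps, w_cbar_E)
```

The only nontrivial overlap is between the two histories ending in $F$; it vanishes because the $b$-particle parts of the two chain kets are orthogonal.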

One can use (25.33) to support the conclusion of H5 in the following way. The
outcome $E$ in the case $B\bar B$ occurs in only one history, on the third line in (25.33).
Upon tracing this outcome back to $\bar d$ as a pivot, and then moving forward in time
on the $O\bar O$ branch we come to $\bar E$ as the counterfactual conclusion. Having
obtained the result in H5, we do not need to discuss H2, H3, and H4. However, it
is possible to justify these statements as well by adding a $B\bar O$ branch to (25.33)
with suitable measurement outcomes at $t_3$ in place of the noncommittal $I$, and then
adding some more events involving properties of particle $a$ at time $t_1$ in order to
construct a suitable pivot for the argument in H2. As the details are not essential
for the present discussion, we leave them as a (nontrivial) exercise for anyone who
wishes to explore the argument in more depth.

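By symmetry, H6 can be justified by the use of a consistent family (with, once
again, the $B\bar O$ and $O\bar B$ branches omitted)

$$\Phi_0 \odot \begin{cases}
c \odot \begin{cases} B\bar B \odot \bar F, \\ O\bar O \odot F, \end{cases} \\[4pt]
d \odot \begin{cases} B\bar B \odot \{\bar E, \bar F\}, \\ O\bar O \odot E, \end{cases}
\end{cases} \tag{25.34}$$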


which is (25.33) with the roles of $a$ and $b$ interchanged. However, H7, which
combines the results of H5 and H6, is not valid, because the family (25.33) on
which H5 is based is incompatible with the family (25.34) on which H6 is based.
The problem with combining these two families is that when one introduces the
events $E$ and $F$ at $t_3$ in the $B\bar B$ branch of a family which contains $c$ and $d$ at an
earlier time, it is essentially the same thing as introducing $e$ and $f$ to make the
inconsistent family $\mathcal{J}_1$, (25.25). In the same way, introducing $\bar E$ and $\bar F$ in the $B\bar B$
branch following an earlier $\bar c$ and $\bar d$ leads to trouble. Even the very first statement
in H7, that $E\bar E$ occurs in case $B\bar B$ with a positive probability, requires the use of
a family which is incompatible with both (25.33) and (25.34)! Thus the road to a
contradiction is blocked by the single-framework rule.

This procedure for blocking the second form of Hardy’s paradox is very similar 
to the one used in Sec. 25.3 for blocking the first form of the paradox. Indeed, for 
the case $B\bar B$ we have used essentially the same families; the only difference comes
from the (somewhat arbitrary) decision to word the second form of the paradox in 
terms of measurement outcomes, and the first in a way which only makes reference 
to particle properties. 

The second form of Hardy’s paradox, like the first, cannot be used to justify 
some form of quantum nonlocality in the sense of some mysterious long-range 
influence of the presence or absence of a beam splitter in the path of one particle 
on the behavior of the other particle. Locality was invoked in H2 and H4 (and at 
the corresponding points in H6). But H2 and H4, as well as the overall conclusion 
in H5, can be supported by using a suitable framework and pivots. (We have only 
given the explicit argument for H5.) Thus, while our analysis does not prove that
the locality assumptions entering H2 and H4 are correct, it shows that there is no
reason to suspect that there is anything wrong with them. The overall argument, 
H1-H8, results in a contradiction. However, the problem lies not in the locality 
assumptions in the earlier statements, but rather in the quantum incompatibility 
overlooked when writing down the otherwise plausible H7. This incompatibility, 
as noted earlier, has to do with the way a single particle is being described, so it 
cannot be blamed on anything nonlocal. 

Our analysis of H1-H6 was based upon particular frameworks. As there are a
large number of different possible frameworks, one might suppose that an
alternative choice might be able to support the counterfactual arguments and lead to a
contradiction. There is, however, a relatively straightforward argument to
demonstrate that no single framework, and thus no set of compatible frameworks, could
possibly support the argument in H1-H7. Consider any framework which
contains $E\bar E$ at $t_3$ both in the case $B\bar B$ and also in the case $O\bar O$. In this framework
both (25.29) and (25.32) are valid: $E\bar E$ occurs with finite probability in case $B\bar B$,
and with zero probability in case $O\bar O$. The reason is that even though (25.29)
and (25.32) were obtained using the framework (25.28), it is a general principle
of quantum reasoning, see Sec. 16.3, that the probability assigned to a collection
of events in one framework will be precisely the same in all frameworks which
contain these events and the same initial data ($\Phi_0$ in the case at hand). But in
any single framework in which $E\bar E$ occurs with probability 0 in the case $O\bar O$ it is
clearly impossible to reach the conclusion at the end of a series of counterfactual
arguments that $E\bar E$ would have occurred with both beam splitters absent had the
outcomes of the quantum coin tosses been different from what actually occurred.

To be more specific, suppose one could find a framework containing a pivot $P$ at
$t_1$ with the following properties: (i) $P$ must have occurred if $B\bar B$ was followed by
$E\bar E$; (ii) if $P$ occurred and was then followed by $O\bar O$, the measurement outcome
would have been $E\bar E$. These are the properties which would permit this
framework to support the counterfactual argument in H1-H7. But since $B\bar B$ followed
by $E\bar E$ has a positive probability, the same must be true of $P$, and therefore $O\bar O$
followed by $E\bar E$ would also have to occur with a finite probability. (A more
detailed analysis shows that $\Pr(E\bar E, t_3 \mid O\bar O, t_2)$ would have to be at least as large
as $\Pr(E\bar E, t_3 \mid B\bar B, t_2)$.) However, since $O\bar O$ is, in fact, never followed by $E\bar E$, a
framework and pivot of this kind does not exist.

The conclusion is that it is impossible to use quantum reasoning in a consistent
way to arrive at the conclusion H7 starting from the assumption H1. In some
respects the analysis just presented seems too simple: it says, in effect, that if a
counterfactual argument of the form H1-H7 arrives at a contradiction, then this
very fact means there is some way in which this argument violates the rules of
quantum reasoning. Can one dispose of a (purported) paradox in such a summary
fashion? Yes, one can. The rule requiring that quantum reasoning of this type
employ a single framework means that the usual rules of ordinary (classical) rea- 
soning and probability theory can be applied as long as one sticks to this particular 
framework, and there can be no contradiction. To put the matter in a different way, 
if there is some very clever way to produce this paradox using only one framework, 
then there will also be a corresponding “classical” paradox, and whatever it is that 
is paradoxical will not be unique to quantum theory. 

Nonetheless, there is some value in our working out specific aspects of the para- 
dox using the explicit families (25.33) and (25.34), for they indicate that the basic 
difficulty with the argument in H1-H8 lies in an implicit assumption that the dif- 
ferent frameworks are compatible, an assumption which is easy to make because 
it is always valid in classical mechanics. Incompatibility rather than some myste- 
rious nonlocality is the crucial feature which distinguishes quantum from classical 
physics, and ignoring it is what has led to a paradox.


26.1 Introduction

Classical mechanics deals with objects which have a precise location and move in 
a deterministic way as a function of time. By contrast, quantum mechanics uses 
wave functions which always have some finite spatial extent, and the time devel- 
opment of a quantum system is (usually) random or stochastic. Nonetheless, most 
physicists regard classical mechanics as an approximation to quantum mechanics, 
an approximation which works well when the object of interest contains a large 
number of atoms. How can it be that classical mechanics emerges as a good ap- 
proximation to quantum mechanics in the case of large objects? 

Part of the answer to the question lies in the process of decoherence in which 
a quantum object or system interacting with a suitable environment (which is also 
quantum mechanical) loses certain types of quantum coherence which would be 
present in a completely isolated system. Even in classical physics the interaction of 
a system with its environment can have significant effects. It can lead to irreversible 
processes in which mechanical energy is turned into heat, with a resulting increase 
of the total entropy. Think of a ball rolling along a smooth, flat surface. Eventually 
it comes to rest as its kinetic energy is changed into heat in the surrounding air 
due to viscous effects, or dissipated as vibrational energy inside the ball or in the 
material which makes up the surface. (From this perspective the vibrational modes 
of the ball form part of its “environment”.) While decoherence is (by definition) 
quantum mechanical, and so lacks any exact analog in classical physics, it is closely 
related to irreversible effects. 

In this chapter we explore a very simple case, one might even think of it as a toy 
model, of a quantum particle interacting with its environment as it passes through 
an interferometer, in order to illustrate some of the principles which govern deco- 
herence. In the final section there are some remarks on how classical mechanics 
emerges as a limiting case of quantum mechanics, and the role which decoherence
plays in relating classical and quantum physics. The discussion of decoherence
and of the classical limit of quantum mechanics presented here is only intended as 
an introduction to a complex subject. The bibliography indicates some sources of 
additional material. 


26.2 Particle in an interferometer 

Consider a particle passing through an interferometer, shown schematically in 
Fig. 26.1, in which an input beam in channel a is separated by a beam splitter 
into two arms c and d, and then passes through a second beam splitter into two 
output channels $e$ and $f$. While this has been drawn as a Mach-Zehnder
interferometer similar to the interferometers considered in earlier chapters, it is best to
think of it as a neutron interferometer or an interferometer for atoms. The prin- 
ciples of interference for photons and material particles are the same, but photons 
tend to interact with their environment in a different way. 



Fig. 26.1. Particle passing through an interferometer. 

Let us suppose that the interferometer is set up so that a particle entering through 
channel $a$ always emerges in the $f$ channel due to interference between the waves
in the two arms c and d. As discussed in Ch. 13, this interference disappears if there 
is a measurement device in one or both of the arms which determines which arm 
the particle passes through. Even in the absence of a measuring device, the particle 
may interact with something, say a gas molecule, while traveling through one arm 
but not through the other arm. In this way the interference effect will be reduced if 
not entirely removed. One refers to this process as decoherence since it removes,
or at least reduces, the interference effects resulting from a coherent superposition
of the two wave packets in the two arms. Sometimes one speaks metaphorically of 
the environment “measuring” which arm the particle passes through.

Assume that at the first beam splitter the particle state undergoes a unitary time
development

$$|a\rangle \mapsto (|c\rangle + |d\rangle)/\sqrt{2}, \tag{26.1}$$

while passage through the arms of the interferometer results in

$$|c\rangle \mapsto |c'\rangle, \quad |d\rangle \mapsto |d'\rangle. \tag{26.2}$$

Here $|a\rangle$ is a wave packet in the input channel at time $t_0$, $|c\rangle$ and $|d\rangle$ are wave packets
emerging from the first beam splitter in the $c$ and $d$ arms of the interferometer at
time $t_1$, and $|c'\rangle$ and $|d'\rangle$ are the corresponding wave packets at time $t_2$ just before
they reach the second beam splitter. The effect of passing through the second beam
splitter is represented by

$$|c'\rangle \mapsto (|e\rangle + |f\rangle)/\sqrt{2}, \quad |d'\rangle \mapsto (-|e\rangle + |f\rangle)/\sqrt{2}, \tag{26.3}$$

where $|e\rangle$ and $|f\rangle$ are wave packets in the output channels of the second beam
splitter at time $t_3$. The notation is chosen to resemble that used for the toy models
in Sec. 12.1 and Ch. 13.

Next assume that while inside the interferometer the particle interacts with
something in the environment in a way which results in a unitary transformation of the
form

$$|c\rangle|\epsilon\rangle \mapsto |c'\rangle|\epsilon'\rangle, \quad |d\rangle|\epsilon\rangle \mapsto |d'\rangle|\epsilon''\rangle, \tag{26.4}$$

on the Hilbert space $\mathcal{A} \otimes \mathcal{E}$ of the particle $\mathcal{A}$ and environment $\mathcal{E}$, where $|\epsilon\rangle$ is the
normalized state of $\mathcal{E}$ at time $t_1$, and $|\epsilon'\rangle$ and $|\epsilon''\rangle$ are normalized states at $t_2$. For
example, it might be the case that if the particle passes through the $c$ arm some
molecule is scattered from it resulting in the change from $|\epsilon\rangle$ to $|\epsilon'\rangle$, whereas if
the particle passes through the $d$ arm there is no scattering, and the change in the
environment from $|\epsilon\rangle$ to $|\epsilon''\rangle$ is the same as it would have been in the absence of
the particle. The complex number

$$\alpha = \langle\epsilon''|\epsilon'\rangle = \alpha' + i\alpha'', \tag{26.5}$$

with real and imaginary parts $\alpha'$ and $\alpha''$, plays an important role in the following
discussion. The final particle wave packets $|c'\rangle$ and $|d'\rangle$ in (26.4) are the same as
in the absence of any interaction with the environment, (26.2). That is, we are
assuming that the scattering process has an insignificant influence upon the center
of mass of the particle itself as it travels through either arm of the interferometer.
This approximation is made in order to simplify the following discussion; one
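could, of course, explore a more complicated situation.

The complete unitary time evolution of the particle and its environment as the
particle passes through the interferometer is given by

$$\begin{aligned}
|\psi_0\rangle = |a\rangle|\epsilon\rangle &\mapsto (|c\rangle + |d\rangle)|\epsilon\rangle/\sqrt{2} \\
&\mapsto |\psi_2\rangle = (|c'\rangle|\epsilon'\rangle + |d'\rangle|\epsilon''\rangle)/\sqrt{2} \\
&\mapsto |\psi_3\rangle = \bigl[\,|e\rangle(|\epsilon'\rangle - |\epsilon''\rangle) + |f\rangle(|\epsilon'\rangle + |\epsilon''\rangle)\bigr]/2,
\end{aligned} \tag{26.6}$$

where we assume that the environment state $|\epsilon\rangle$ does not change between $t_0$ and $t_1$.
(This is not essential, and one could assume a different state, say $|\bar\epsilon\rangle$, at $t_0$, which
develops unitarily into $|\epsilon\rangle$ at $t_1$.) Therefore, in the family with support
$[\psi_0] \odot I \odot I \odot \{[e], [f]\}$ the probabilities for the particle emerging in each of the output
channels are given by

$$\begin{aligned}
\Pr(e) &= \tfrac{1}{4}\,(\langle\epsilon'| - \langle\epsilon''|)\cdot(|\epsilon'\rangle - |\epsilon''\rangle) \\
&= \tfrac{1}{4}\,(\langle\epsilon'|\epsilon'\rangle + \langle\epsilon''|\epsilon''\rangle - \langle\epsilon'|\epsilon''\rangle - \langle\epsilon''|\epsilon'\rangle) = \tfrac{1}{2}(1 - \alpha'), \\
\Pr(f) &= \tfrac{1}{4}\,(\langle\epsilon'| + \langle\epsilon''|)\cdot(|\epsilon'\rangle + |\epsilon''\rangle) = \tfrac{1}{2}(1 + \alpha').
\end{aligned} \tag{26.7}$$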

Because the states entering the inner product in (26.5) are normalized, $|\alpha|$
cannot be greater than 1. If $|\epsilon''\rangle = |\epsilon'\rangle$, then $\alpha' = \alpha = 1$ and there is no decoherence:
the interference pattern is the same as in the absence of any interaction with the
environment, and the particle always emerges in $f$. The interference effect
disappears when $\alpha' = 0$, and the particle emerges with equal probability in $e$ or $f$.
This could happen even with $|\alpha|$ rather large, for example, $\alpha = i$. But in such a
case there would still be a substantial coherence between the wave packets in the $c$
and $d$ arms, and the corresponding interference effect could be detected by shifting
the second beam splitter by a small amount so as to change the difference in path
length between the $c$ and $d$ arms by a quarter wavelength. Hence it seems sensible
to use $|\alpha|$ rather than $\alpha'$ as a measure of coherence between the two arms of the
interferometer, and $1 - |\alpha|$ as a measure of the amount of decoherence.
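
The dependence of the output probabilities on $\alpha$ can be illustrated with a small numerical sketch. It assumes, purely for illustration, a two-dimensional environment in which $|\epsilon''\rangle = (1, 0)$ and $|\epsilon'\rangle = (\alpha, \sqrt{1 - |\alpha|^2})$, so that $\langle\epsilon''|\epsilon'\rangle = \alpha$; the function name `output_probs` is hypothetical. The probabilities are then read off from the environment kets multiplying $|e\rangle$ and $|f\rangle$ in the final state of (26.6).

```python
from math import sqrt


def output_probs(alpha):
    """Return (Pr(e), Pr(f)) for a given overlap alpha = <eps''|eps'>, |alpha| <= 1."""
    alpha = complex(alpha)
    # Illustrative two-dimensional environment states with the required overlap.
    eps_dd = (1.0 + 0j, 0j)                                          # |eps''>
    eps_d = (alpha, sqrt(max(0.0, 1.0 - abs(alpha) ** 2)) + 0j)      # |eps'>
    # Environment kets multiplying |e> and |f> in the final state of (26.6).
    e_part = [(x - y) / 2 for x, y in zip(eps_d, eps_dd)]
    f_part = [(x + y) / 2 for x, y in zip(eps_d, eps_dd)]
    pr_e = sum(abs(z) ** 2 for z in e_part)
    pr_f = sum(abs(z) ** 2 for z in f_part)
    return pr_e, pr_f


for a in (1, 1j, 0.6 + 0.3j, 0):
    print(a, output_probs(a))
```

For $\alpha = i$ the two channels are equally likely even though $|\alpha| = 1$, which is the case singled out in the text as showing why $|\alpha|$, rather than $\alpha'$, is the better measure of coherence.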


26.3 Density matrix 

In a situation in which one is interested in what happens to the particle after it
passes through the second beam splitter without reference to the final state of the
environment, it is convenient to use a density matrix $\rho_2$ for the particle at the
intermediate time $t_2$ in (26.6), just before the particle passes through the second beam
splitter. By taking a partial trace over the environment $\mathcal{E}$ in the manner indicated
in Sec. 15.3, one obtains

$$\rho_2 = \mathrm{Tr}_{\mathcal{E}}\bigl(|\psi_2\rangle\langle\psi_2|\bigr) = \tfrac{1}{2}\bigl(|c'\rangle\langle c'| + |d'\rangle\langle d'| + \alpha\,|c'\rangle\langle d'| + \alpha^*\,|d'\rangle\langle c'|\bigr). \tag{26.8}$$





This has the form

$$\begin{pmatrix} 1/2 & \alpha/2 \\ \alpha^*/2 & 1/2 \end{pmatrix} \tag{26.9}$$

when written as a matrix in the basis $\{|c'\rangle, |d'\rangle\}$, with $\langle c'|\rho_2|c'\rangle$ in the upper left
corner. If we think of $\rho_2$ as a pre-probability, see Sec. 15.2, its diagonal elements
represent the probability that the particle will be in the $c$ or the $d$ arm. Twice the
magnitude of the off-diagonal elements serves as a convenient measure of
coherence between the two arms of the interferometer, and thus $1 - |\alpha|$ is a measure of
the decoherence.

After the particle passes through the second beam splitter, the density matrix is 
given by (see Sec. 15.4) 


$$
\rho_3 = T_A(t_3, t_2)\,\rho_2\,T_A(t_2, t_3), \tag{26.10}
$$

where $T_A(t_3, t_2)$ is the unitary transformation produced by the second beam splitter,
(26.3), and we assume that during this process there is no further interaction of the
particle with the environment. The result is

$$
\rho_3 = \tfrac12 \bigl[(1 - \alpha')\,|e\rangle\langle e| + (1 + \alpha')\,|f\rangle\langle f| + i\alpha''\bigl(|e\rangle\langle f| - |f\rangle\langle e|\bigr)\bigr]. \tag{26.11}
$$

The diagonal parts of $\rho_3$, the coefficients of $|e\rangle\langle e|$ and $|f\rangle\langle f|$, are the probabilities
that the particle will emerge in the e or the f channel, and are, of course, identical
with the expressions in (26.7).
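
The step from (26.9) to (26.11) can be checked numerically. The sketch below is a minimal illustration in Python: it assumes a beam splitter acting as $|c'\rangle \mapsto (|e\rangle + |f\rangle)/\sqrt{2}$, $|d'\rangle \mapsto (-|e\rangle + |f\rangle)/\sqrt{2}$, a convention chosen here because it reproduces (26.7) and (26.11); equation (26.3) itself is not reproduced in this section. The helper names and the sample value of $\alpha$ are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Conjugate transpose of a matrix stored as nested lists."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def rho2(alpha):
    """Reduced density matrix (26.9) in the basis {|c'>, |d'>}."""
    alpha = complex(alpha)
    return [[0.5, alpha / 2], [alpha.conjugate() / 2, 0.5]]


S = 2 ** -0.5
# Assumed beam splitter: columns are the images of |c'> and |d'> in the
# output basis {|e>, |f>}.
BEAM_SPLITTER = [[S, -S], [S, S]]


def rho3(alpha):
    """Density matrix after the second beam splitter, as in (26.10)."""
    return matmul(matmul(BEAM_SPLITTER, rho2(alpha)), dagger(BEAM_SPLITTER))


alpha = 0.3 + 0.4j  # sample value with alpha' = 0.3, alpha'' = 0.4
r3 = rho3(alpha)
pr_e, pr_f = r3[0][0].real, r3[1][1].real
print("Pr(e) =", pr_e, " expected", 0.5 * (1 - alpha.real))
print("Pr(f) =", pr_f, " expected", 0.5 * (1 + alpha.real))
print("<e|rho3|f> =", r3[0][1], " expected", 0.5j * alpha.imag)
```

Only the real part $\alpha'$ enters the diagonal elements, while the imaginary part $\alpha''$ appears solely in the off-diagonal terms, in agreement with (26.11).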

Using a density matrix is particularly convenient for discussing a situation in 
which the particle interacts with the environment more than once as it passes 
through the c or the d arm of the interferometer. The simplest situation to ana- 
lyze is one in which each of these interactions is independent of the others, and 
they do not alter the wave packet of the particle. In particular, let the environment 
consist of a number of separate pieces (e.g., separate molecules) with a Hilbert 
space 

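$$
\mathcal{E} = \mathcal{E}_1 \otimes \mathcal{E}_2 \otimes \mathcal{E}_3 \otimes \cdots \otimes \mathcal{E}_n \tag{26.12}
$$

and an initial state

$$
|\epsilon\rangle = |\epsilon_1\rangle \otimes |\epsilon_2\rangle \otimes |\epsilon_3\rangle \otimes \cdots \otimes |\epsilon_n\rangle \tag{26.13}
$$

at time $t_1$. The $j$th interaction results in $|\epsilon_j\rangle$ changing to $|\epsilon'_j\rangle$ if the particle is in the
c arm, or to $|\epsilon''_j\rangle$ if the particle is in the d arm. Thus the net effect of all of these
interactions as the particle passes through the interferometer is

$$
\begin{aligned}
|c\rangle|\epsilon\rangle = |c\rangle|\epsilon_1\rangle|\epsilon_2\rangle \cdots |\epsilon_n\rangle &\mapsto |c'\rangle|\epsilon'\rangle = |c'\rangle|\epsilon'_1\rangle|\epsilon'_2\rangle \cdots |\epsilon'_n\rangle, \\
|d\rangle|\epsilon\rangle = |d\rangle|\epsilon_1\rangle|\epsilon_2\rangle \cdots |\epsilon_n\rangle &\mapsto |d'\rangle|\epsilon''\rangle = |d'\rangle|\epsilon''_1\rangle|\epsilon''_2\rangle \cdots |\epsilon''_n\rangle.
\end{aligned}
\tag{26.14}
$$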

The reduced density matrix $\rho_2$ for the particle just before it passes through the
second beam splitter is again of the form (26.8) or (26.9), with

$$
\alpha = \langle\epsilon''|\epsilon'\rangle = \alpha_1 \alpha_2 \cdots \alpha_n, \tag{26.15}
$$

where

$$
\alpha_j = \langle\epsilon''_j|\epsilon'_j\rangle, \tag{26.16}
$$

and $\rho_3$, when the particle has passed through the second beam splitter, is again
given by (26.11).

In a typical situation one would expect the $|\alpha_j|$ to be less than 1, though not
necessarily small. Note that if there are a large number of collisions, $\alpha$ in (26.15)
can be very small, even if the individual $\alpha_j$ are not themselves small quantities.
Thus repeated interactions with the environment will in general lead to greater
decoherence than that produced by a single interaction, and if these interactions are
of roughly the same kind, one expects the coherence $|\alpha|$ to decrease exponentially
with the number of interactions.
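
The exponential decrease follows directly from the product form (26.15). A minimal Python sketch, assuming for illustration that every interaction has the same overlap $\alpha_j = 0.9$ (a made-up value, as is the function name `net_coherence`):

```python
def net_coherence(alphas):
    """Product alpha = alpha_1 alpha_2 ... alpha_n of (26.15)."""
    product = 1 + 0j
    for a in alphas:
        product *= a
    return product


# Assumed value: every collision has the same overlap alpha_j = 0.9.
single = 0.9
coherence = {n: abs(net_coherence([single] * n)) for n in (1, 10, 50, 100)}
for n, value in coherence.items():
    print(f"n = {n:3d}   |alpha| = {value:.3e}")
```

With identical factors $|\alpha| = |\alpha_j|^n = e^{-n \ln(1/|\alpha_j|)}$, so an overlap that is individually close to 1 still drives the coherence toward zero after enough interactions.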

Even if the different interactions with the environment are not independent of 
one another, the net effect may well be much the same, although it might take more 
interactions to produce a given reduction of $|\alpha|$. In any case, what happens at the
second beam splitter, in particular the probability that the particle will emerge in
each of the output channels, depends only on the density matrix $\rho$ for the particle
when it arrives at this beam splitter, and not on the details of all the scattering 
processes which have occurred earlier. For this reason, a density matrix is very 
convenient for analyzing the nature and extent of decoherence in this situation. 


26.4 Random environment 

Suppose that the environment which interacts with the particle is itself random, and
that at time $t_1$, when the particle emerges from the first beam splitter, it is described
by a density matrix $R_1$ which can be written in the form

$$
R_1 = \sum_j p_j\,|\epsilon^j\rangle\langle\epsilon^j|, \tag{26.17}
$$

where $\{|\epsilon^j\rangle\}$ is an orthonormal basis of $\mathcal{E}$, and $\sum_j p_j = 1$. Although it is natural,
and for many purposes not misleading, to think of the environment as being in the
state $|\epsilon^j\rangle$ with probability $p_j$, we shall think of $R_1$ as simply a pre-probability
(Sec. 15.2). Assume that while the particle is inside the interferometer, during the
time interval from $t_1$ to $t_2$, the interaction with the environment gives rise to unitary
transformations

$$
|c\rangle|\epsilon^j\rangle \mapsto |c'\rangle|\zeta^j\rangle, \qquad |d\rangle|\epsilon^j\rangle \mapsto |d'\rangle|\eta^j\rangle, \tag{26.18}
$$

where we are again assuming, as in (26.4) and (26.14), that the environment has
a negligible influence on the particle wave packets $|c'\rangle$ and $|d'\rangle$. Because the time
evolution is unitary, $\{|\zeta^j\rangle\}$ and $\{|\eta^j\rangle\}$ are orthonormal bases of $\mathcal{E}$.

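Let the state of the particle and the environment at $t_1$ be given by a density matrix

$$
\Psi_1 = [\bar a] \otimes R_1, \tag{26.19}
$$

where $|\bar a\rangle = (|c\rangle + |d\rangle)/\sqrt{2}$ is the state of the particle when it emerges from the first
beam splitter. At $t_2$, just before the particle leaves the interferometer, the density
matrix resulting from unitary time evolution of the total system will be

$$
\begin{aligned}
\Psi_2 = \tfrac12 \sum_j p_j \bigl[\, & |c'\rangle\langle c'| \otimes |\zeta^j\rangle\langle\zeta^j| + |d'\rangle\langle d'| \otimes |\eta^j\rangle\langle\eta^j| \\
& + |c'\rangle\langle d'| \otimes |\zeta^j\rangle\langle\eta^j| + |d'\rangle\langle c'| \otimes |\eta^j\rangle\langle\zeta^j| \,\bigr].
\end{aligned}
\tag{26.20}
$$

Taking a partial trace gives the expression

$$
\rho_2 = \mathrm{Tr}_{\mathcal{E}}(\Psi_2) = \tfrac12 \bigl[|c'\rangle\langle c'| + |d'\rangle\langle d'| + \alpha\,|c'\rangle\langle d'| + \alpha^*|d'\rangle\langle c'|\bigr], \tag{26.21}
$$

for the reduced density matrix of the particle at time $t_2$, where

$$
\alpha = \sum_j p_j \langle\eta^j|\zeta^j\rangle. \tag{26.22}
$$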

The expression (26.21) is formally identical to (26.8), but the complex parameter
$\alpha$ is now a weighted average of a collection of complex numbers, the inner products
$\{\langle\eta^j|\zeta^j\rangle\}$, each with magnitude less than or equal to 1. Consider the case in which

$$
\langle\eta^j|\zeta^j\rangle = e^{i\phi_j}, \tag{26.23}
$$

that is, the interaction with the environment results in nothing but a phase difference
between the wave packets of the particle in the c and d arms. Even though
$|\langle\eta^j|\zeta^j\rangle| = 1$ for every $j$, the sum (26.22) will in general result in $|\alpha| < 1$, and if
the sum includes a large number of random phases, $|\alpha|$ can be quite small. Hence
a random environment can produce decoherence even in circumstances in which a
nonrandom environment (as discussed in Secs. 26.2 and 26.3) does not.
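
The averaging in (26.22) with pure phase factors (26.23) is easy to simulate. The Python sketch below assumes equal probabilities $p_j = 1/n$ and phases drawn uniformly from $[0, 2\pi)$; both choices, the seed, and the function name `averaged_alpha` are illustrative assumptions only.

```python
import cmath
import math
import random


def averaged_alpha(phases, weights):
    """alpha = sum_j p_j exp(i phi_j), i.e. (26.22) with (26.23)."""
    return sum(p * cmath.exp(1j * phi) for p, phi in zip(weights, phases))


rng = random.Random(0)   # fixed seed so that the run is reproducible
n = 10_000
weights = [1.0 / n] * n  # equal probabilities p_j, an assumption

# Random phases: the unit-magnitude terms largely cancel in the average.
random_phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
alpha_random = averaged_alpha(random_phases, weights)

# Identical phases: nothing averages out and |alpha| stays equal to 1.
equal_phases = [0.7] * n
alpha_equal = averaged_alpha(equal_phases, weights)

print("random phases:    |alpha| =", abs(alpha_random))
print("identical phases: |alpha| =", abs(alpha_equal))
```

For equal weights and independent uniform phases the root-mean-square value of $|\alpha|$ is $1/\sqrt{n}$, so the coherence falls off with the number of terms even though no single term has magnitude less than 1.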

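The basis $\{|\epsilon^j\rangle\}$ in which $R_1$ is diagonal is useful for calculations, but does not
actually enter into the final result for $\rho_2$. To see this, rewrite (26.18) in the form

$$
|c\rangle \otimes |\epsilon\rangle \mapsto |c'\rangle \otimes U_c|\epsilon\rangle, \qquad |d\rangle \otimes |\epsilon\rangle \mapsto |d'\rangle \otimes U_d|\epsilon\rangle, \tag{26.24}
$$

where $|\epsilon\rangle$ is any state of the environment, and $U_c$ and $U_d$ are unitary transformations
on $\mathcal{E}$. Then (26.22) can be written in the form

$$
\alpha = \mathrm{Tr}_{\mathcal{E}}\bigl(R_1 U_d^\dagger U_c\bigr), \tag{26.25}
$$

which makes no reference to the basis $\{|\epsilon^j\rangle\}$.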
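
The equivalence of (26.22) and (26.25) can be verified on a small example. In the Python sketch below the two-dimensional environment, the probabilities, and the particular unitaries $U_c$ (a rotation) and $U_d$ (a relative phase) are arbitrary choices made for illustration; none of them appears in the text.

```python
import cmath
import math


def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Conjugate transpose of a matrix stored as nested lists."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


def column(u, j):
    """Image U|eps^j> of the j-th basis vector."""
    return [row[j] for row in u]


theta, chi = 0.4, 1.1  # arbitrary illustrative angles
p = [0.25, 0.75]       # probabilities p_j, an assumption

R1 = [[p[0], 0.0], [0.0, p[1]]]                  # diagonal in {|eps^j>}
U_c = [[math.cos(theta), -math.sin(theta)],
       [math.sin(theta), math.cos(theta)]]       # a rotation
U_d = [[1.0, 0.0], [0.0, cmath.exp(1j * chi)]]   # a relative phase

# (26.25): alpha = Tr(R1 U_d^dagger U_c)
alpha_trace = trace(matmul(R1, matmul(dagger(U_d), U_c)))

# (26.22): alpha = sum_j p_j <eta^j|zeta^j>, with |zeta^j> = U_c|eps^j>
# and |eta^j> = U_d|eps^j>.
alpha_sum = sum(p[j] * inner(column(U_d, j), column(U_c, j)) for j in range(2))

print("trace formula:", alpha_trace)
print("sum formula:  ", alpha_sum)
```

Because a trace is unchanged under a change of orthonormal basis, the first expression can be evaluated in any basis of the environment, which is the content of the remark following (26.25).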


26.5 Consistency of histories

Consider a family of histories at times $t_0 < t_1 < t_2 < t_3$ with support

$$
\begin{aligned}
Y^{ce} &= [\psi_0] \odot [c] \odot [c'] \odot [e], \\
Y^{de} &= [\psi_0] \odot [d] \odot [d'] \odot [e], \\
Y^{cf} &= [\psi_0] \odot [c] \odot [c'] \odot [f], \\
Y^{df} &= [\psi_0] \odot [d] \odot [d'] \odot [f],
\end{aligned}
\tag{26.26}
$$

where $[\psi_0]$ is the initial state $|a\rangle|\epsilon\rangle$ in (26.6), and the unitary dynamics is that of
Sec. 26.2. The chain operators for histories which end in $[e]$ are automatically
orthogonal to those of histories which end in $[f]$. However, when the final states
are the same, the inner products are

$$
\langle K(Y^{df}), K(Y^{cf})\rangle = \alpha/4 = -\langle K(Y^{de}), K(Y^{ce})\rangle, \tag{26.27}
$$

where $\alpha$ is the parameter defined in (26.5), which appears in the density matrix
(26.8) or (26.9). Equation (26.27) is also valid in the case of multiple interactions
with the environment, where $\alpha$ is given by (26.15). And it holds for the random
environment discussed in Sec. 26.4, with $\alpha$ defined in (26.22), provided one redefines
the histories in (26.26) by eliminating the initial state $[\psi_0]$, so that each
history begins with $[c]$ or $[d]$ at $t_1$, and uses the density matrix $\Psi_1$, (26.19), as an
initial state at time $t_1$ in the consistency condition (15.48). In this case the operator
inner product used in (26.27) is $\langle\ ,\ \rangle_{\Psi_1}$, as it involves the density matrix $\Psi_1$, see
(15.48).

If there is no interaction with the environment, then $\alpha = 1$ and (26.27) implies
that the family (26.26) is not consistent. However, if $\alpha$ is very small, even though
it is not exactly zero, one can say that the family (26.26) is approximately consistent,
or consistent for all practical purposes, for the reasons indicated at the end
of Sec. 10.2: one expects that by altering the projectors by small amounts one can 
produce a nearby family which is exactly consistent, and which has essentially the 
same physical significance as the original family. 
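
The inner products in (26.27) can be reproduced with a few lines of Python. This is a sketch under stated assumptions: for a pure initial state the operator inner product of two chain operators is taken to equal the inner product of the corresponding chain kets; both beam splitters are assumed to act with amplitudes $\pm 1/\sqrt{2}$ in the convention that reproduces (26.7); and the environment vectors are invented for the example.

```python
S = 2 ** -0.5

# Assumed amplitudes <out|T|arm'> for the second beam splitter.
BEAM_SPLITTER = {("c", "e"): S, ("c", "f"): S, ("d", "e"): -S, ("d", "f"): S}


def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def chain_ket_env(arm, out, eps_c, eps_d):
    """Environment factor of the chain ket for the history through `arm`
    ending in channel `out`; the particle factor is |out>."""
    env = eps_c if arm == "c" else eps_d
    # 1/sqrt(2) for entering either arm, times the second beam splitter.
    amplitude = S * BEAM_SPLITTER[(arm, out)]
    return [amplitude * x for x in env]


def chain_inner(arm1, arm2, out, eps_c, eps_d):
    """Inner product of the chain kets of two histories with the same
    final channel but different arms."""
    return inner(chain_ket_env(arm1, out, eps_c, eps_d),
                 chain_ket_env(arm2, out, eps_c, eps_d))


# Hypothetical normalized environment states |eps'> and |eps''>.
eps_c = [1.0, 0.0]
eps_d = [0.6j, 0.8]
alpha = inner(eps_d, eps_c)  # alpha = <eps''|eps'>

k_f = chain_inner("d", "c", "f", eps_c, eps_d)
k_e = chain_inner("d", "c", "e", eps_c, eps_d)
print("alpha/4 =", alpha / 4)
print("<K(Y^df), K(Y^cf)> =", k_f)
print("<K(Y^de), K(Y^ce)> =", k_e)
```

The two inner products differ only in sign and vanish together when the environment states are orthogonal, which is the case in which the family (26.26) is exactly consistent.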

This shows that the presence of decoherence may make it possible to discuss 
the time dependence of a quantum system using a family of histories which in the 
absence of decoherence would violate the consistency conditions and thus not make
sense. This is an important consideration when one wants to understand how the 
classical behavior of macroscopic objects is consistent with quantum mechanics, 
which is the topic of the next section. 


26.6 Decoherence and classical physics 

The simple example discussed in the preceding sections of this chapter illustrates 
two important consequences of decoherence: it can destroy interference effects, 
and it can render certain families of histories of a subsystem consistent, or at least
approximately consistent, when in the absence of decoherence such a family is in- 
consistent. There is an additional important effect which is not part of decoherence 
as such, for it can arise in either a classical or a quantum system interacting with its 
environment: the environment perturbs the motion of the system one is interested 
in, typically in a random way. (A classical example is Brownian motion, Sec. 8.1.) 

The laws of classical mechanics are simple, have an elegant mathematical form, 
and are quite unlike the laws of quantum mechanics. Nonetheless, physicists be- 
lieve that classical laws are only an approximation to the more fundamental quan- 
tum laws, and that quantum mechanics determines the motion of macroscopic 
objects made up of many atoms in the same way as it determines the motion of 
the atoms themselves, and that of the elementary particles of which the atoms are 
composed. However, showing that classical physics is a limiting case of quantum 
physics is a nontrivial task which, despite considerable progress, is not yet com- 
plete, and a detailed discussion lies outside the scope of this book. The following 
remarks are intended to give a very rough and qualitative picture of how the cor- 
respondence between classical and quantum physics comes about. More detailed 
treatments will be found in the references listed in the bibliography. 

A macroscopic object such as a baseball, or even a grain of dust, is made up of 
an enormous number of atoms. The description of its motion provided by classi- 
cal physics ignores most of the mechanical degrees of freedom, and focuses on a 
rather small number of collective coordinates. These are, for example, the center 
of mass and the Euler angles for a rigid body, to which may be added the vibra- 
tional modes for a flexible object. For a fluid, the collective coordinates are the 
hydrodynamic variables of mass and momentum density, thought of as obtained by 
“coarse graining” an atomic description by averaging over small volumes which 
still contain a very large number of atoms. It is important to note that the classi- 
cal description employs a very special set of quantities, rather than using all the 
mechanical degrees of freedom. 

It is plausible that properties represented by classical collective coordinates, such 
as “the mass density in region X has the value Y”, correspond to projectors onto
subspaces of a suitable Hilbert space. These subspaces will have a very large di- 
mension, because the classical description is relatively coarse, and there will not 
be a unique projector corresponding to a classical property, but instead a collection 
of projectors (or subspaces), all of which correspond within some approximation 
to the same classical property. 

In the same way, a classical property which changes as a function of time will 
be associated with different projectors as time progresses, and thus with a quantum 
history. The continuous time variable of a classical description can be related to the 
discrete times of a quantum history in much the same way as a continuous classical 
mass distribution is related to the discrete atoms of a quantum description. Just
as a given classical property will not correspond to a unique quantum projector, 
there will be many quantum histories, and families of histories, which correspond 
to a given classical description of the motion, and represent it to a fairly good 
approximation. The term “quasi-classical” is used for such a quantum family and 
the histories which it contains. 

In order for a quasi-classical family to qualify as a genuine quantum description, 
it must satisfy the consistency conditions. Can one be sure that this is the case? 
Gell-Mann, Hartle, Brun, and Omnès (see references in the bibliography) have
studied this problem, and concluded that there are some fairly general conditions 
under which one can expect consistency conditions to be at least approximately 
satisfied for quasi-classical families of the sort one encounters in hydrodynamics 
or in the motion of rigid objects. That such quasi-classical families will turn out 
to be consistent is made plausible by the following consideration. Any system of 
macroscopic size is constantly in contact with an environment. Even a dust par- 
ticle deep in interstellar space is bombarded by the cosmic background radiation, 
and will occasionally collide with atoms or molecules. In addition to an external 
environment of this sort, macroscopic systems have an internal environment con- 
stituted by the degrees of freedom left over when the collective coordinates have 
been specified. Both the external and the internal environment can contribute to 
processes of decoherence, and these can make it very hard to observe quantum in- 
terference effects. While the absence of interference, which is signaled by the fact 
that the density matrix of the subsystem is (almost) diagonal in a suitable repre- 
sentation, is not the same thing as the consistency of a suitable family of histories, 
nonetheless the two are related, as suggested by the example considered earlier 
in this chapter, where the same parameter $\alpha$ characterizes both the degree of coherence
of the particle when it leaves the interferometer, and also the extent to
which certain consistency conditions are not fulfilled. The effectiveness of this 
kind of decoherence is what makes it very difficult to design experiments in which 
macroscopic objects, even those no bigger than large molecules, exhibit quantum 
interference. 

If a quasi-classical family can be shown to be consistent, will the histories in it 
obey, at least approximately, classical equations of motion? Again, this is a non- 
trivial question, and we refer the reader to the references in the bibliography for 
various studies. For example, Omnès has published a fairly general argument that
classical and quantum mechanics give similar results if the quantum projectors cor- 
respond (approximately) to a cell in the classical phase space which is not too small 
and has a fairly regular shape, provided that during the time interval of interest the 
classical equations of motion do not result in too great a distortion of this cell. This 
last condition can break down rather quickly in the presence of chaos, a situation in 
which the motion predicted by the classical equations depends in a very sensitive
way upon initial conditions. 

Classical equations of motion are deterministic, whereas a quantum description 
employing histories is stochastic. How can these be reconciled? The answer is that 
the classical equations are idealizations which in appropriate circumstances work 
rather well. However, one must expect the motion of any real macroscopic system 
to show some effects of a random environment. The deterministic equations one 
usually writes down for classical collective coordinates ignore these environmental 
effects. The equations can be modified to allow for the effects of the environment 
by including stochastic noise, but then they are no longer deterministic, and this 
narrows the gap between classical and quantum descriptions. It is also worth keep- 
ing in mind that under appropriate conditions the quantum probability associated 
with a suitable quasi-classical history of macroscopic events can be very close to 
1. These considerations would seem to remove any conflict between classical and
quantum physics with respect to determinism, especially when one realizes that the 
classical description must in any case be an approximation to some more accurate 
quantum description. 

In conclusion, even though many details have not been worked out and much 
remains to be done, there is no reason at present to doubt that the equations of 
classical mechanics represent an appropriate limit of a more fundamental quantum 
description based upon a suitable set of consistent histories. Only certain aspects 
of the motion of macroscopic physical bodies, namely those described by appro- 
priate collective coordinates, are governed by classical laws. These laws provide 
an approximate description which, while quite adequate for many purposes, will 
need to be supplemented in some circumstances by adding a certain amount of 
environmental or quantum noise. 


27.1 Introduction

The connection between human knowledge and the real world to which it is (hope- 
fully) related is a difficult problem in philosophy. The purpose of this chapter is 
not to discuss the general problem, but only some aspects of it to which quantum 
theory might make a significant contribution. In particular, we want to discuss the 
question as to how quantum mechanics requires us to revise pre-quantum ideas 
about the nature of physical reality. This is still a very large topic, and space will 
permit no more than a brief discussion of some of the significant issues. 

Physical theories should not be confused with physical reality. The former are, 
at best, some sort of abstract or symbolic representation of the latter, and this is 
as true of classical physics as of quantum physics. The phase space used to repre- 
sent a classical system and the Hilbert space used for a quantum system are both 
mathematical constructs, not physical objects. Neither planets nor electrons inte- 
grate differential equations in order to decide where to go next. Wave functions 
exist in the theorist’s notebook and not, unless in some metaphorical sense, in the 
experimentalist’s laboratory. One might think of a physical theory as analogous 
to a photograph, in that it contains a representation of some object, but is not the 
object itself. Or one can liken it to a map of a city, which symbolizes the locations 
of streets and buildings, even though it is only made of paper and ink. 

We can comprehend (to some extent) with our minds the mathematical and log- 
ical structure of a physical theory. If the theory is well developed, there will be 
clear relationships among the mathematical and logical elements, and one can dis- 
cuss whether the theory is coherent, logical, beautiful, etc. The question of whether 
a theory is true, its relationship to the real world “out there”, is more subtle. Even if 
a theory has been well confirmed by experimental tests, as in the case of quantum 
mechanics, believing that it is (in some sense) a true description of the real world 
requires a certain amount of faith. A decision to accept a theory as an adequate, 
or even as an approximate representation of the world is a matter of judgment
which must inevitably move beyond issues of mathematical proof, logical rigor, 
and agreement with experiment. 

If a theory makes a certain amount of sense and gives predictions which agree 
reasonably well with experimental or observational results, scientists are inclined 
to believe that its logical and mathematical structure reflects the structure of the 
real world in some way, even if philosophers will remain permanently sceptical. 
Granted that all theories are eventually shown to have limitations, we nonethe- 
less think that Newton’s mechanics is a great improvement over that of Aristotle, 
because it is a much better reflection of what the real world is like, and that relativ- 
ity theory improves upon the science of Newton because space-time actually does 
have a structure in which light moves at the same speed in any inertial coordinate 
system. Theories such as classical mechanics and classical electromagnetism do a 
remarkably good job within their domains of applicability. How can this be under- 
stood if not by supposing that they reflect something of the real world in which we 
live? 

The same remarks apply to quantum mechanics. Since it has a consistent math- 
ematical and logical structure, and is in good agreement with a vast amount of 
observational and experimental data, it is plausible that quantum theory is a better 
reflection of what the real world is like than the classical theories which preceded 
it, and which could not explain many of the microscopic phenomena that are now 
understood using quantum methods. The faith of the physicist is that the real world 
is something like our best theories, and at the present time it is universally agreed 
that quantum mechanics is a very good theory of the physical world, better than 
any other currently available to us. 


27.2 Quantum vs. classical reality 

What are the main respects in which quantum mechanics differs from classical 
mechanics? To begin with, quantum theory employs wave functions belonging to 
a Hilbert space, rather than points in a classical phase space, in order to describe 
a physical system. Thus a quantum particle, in contrast to a classical particle, 
Secs. 2.3 and 2.4, does not possess a precise position or a precise momentum. 
In addition, the precision with which either of these quantities can be defined is 
limited by the Heisenberg uncertainty principle, (2.22). This does not mean that 
quantum entities are “fuzzy” and ill-defined, for a ray in the Hilbert space is as 
precise a specification as a point in phase space. What it does mean is that the clas- 
sical concepts of position and momentum can only be used in an approximate way 
when applied to the quantum domain. As pointed out in Sec. 2.4, the uncertainty 
principle refers primarily to the fact that quantum entities are described by a very 
different mathematical structure than are classical particles, and only secondarily
to issues associated with measurements. The limitations on measurements come 
about because of the nature of quantum reality, and the fact that what does not exist 
cannot be measured. 

A second respect in which quantum mechanics is fundamentally different from 
classical mechanics is that the basic classical dynamical laws are deterministic, 
whereas quantum dynamical laws are, in general, stochastic or probabilistic, so 
that the future behavior of a quantum system cannot be predicted with certainty, 
even when given a precise initial state. It is important to note that in quantum 
theory this unpredictability in a system’s time development is an intrinsic feature 
of the world, in contrast to examples of stochastic time development in classical 
physics, such as the diffusion of a Brownian particle (Sec. 8.1). Classical unpre- 
dictability arises because one is using a coarse-grained description where some 
information about the underlying deterministic system has been thrown away, and 
there is always the possibility, in principle, of a more precise description in which 
the probabilistic element is absent, or at least the uncertainties reduced to any extent
one desires. By contrast, the Born rule or its extension to more complicated
situations, Chs. 9 and 10, enters quantum theory as an axiom, and does not result
from coarse graining a more precise description. To be sure, there have been efforts
to replace the stochastic structure of quantum theory with something more
akin to the determinism of classical physics, by supplementing the Hilbert space
with hidden variables. But these have not turned out to be very fruitful, and, as dis- 
cussed in Ch. 24, the Bell inequalities indicate that such theories can only restore 
determinism at the price of introducing nonlocal influences violating the principles 
of special relativity. 

Of course, there is no reason to suppose that quantum mechanics as understood 
at the present time is the ultimate theory of how the world works. It could be that at 
some future date its probabilistic laws will be derived from a superior theory which 
returns to some form of determinism, but it is equally possible that future theories 
will continue to incorporate probabilistic time development as a fundamental fea- 
ture. The fact that it was only with great reluctance that physicists abandoned 
classical determinism in the course of developing a theory capable of explaining 
experimental results in atomic physics strongly suggests, though it does not prove, 
that stochastic time development is part of physical reality. 


27.3 Multiple incompatible descriptions 

The feature of quantum theory which differs most from classical physics is that it 
allows one to describe a physical system in many different ways which are incompatible
with one another. Under appropriate circumstances two (or more) incompatible
descriptions can be said to be true in the sense that they can be derived in
different incompatible frameworks starting from the same information about the 
system (the same initial data), but they cannot be combined in a single description, 
see Sec. 16.4. There is no really good classical analog of this sort of incompatibil- 
ity, which is very different from what we find in the world of everyday experience, 
and it suggests that reality is in this respect very different from anything dreamed 
of prior to the advent of quantum mechanics. 

As a specific example, consider the situation discussed in Sec. 18.4 using
Fig. 18.4, where a nondestructive measurement of $S_z$ is carried out on a spin-half
particle by one measuring device, and this is followed by a later measurement of
$S_x$ using a second device. There is a framework $\mathcal{F}$, (18.31), in which it is possible
to infer that at the time $t_1$ when the particle was between the two measuring
devices it had the property $S_z = +1/2$, and another, incompatible framework $\mathcal{G}$,
(18.33), in which one can infer the property $S_x = +1/2$ at $t_1$. But there is no way
in which these inferences, even though each is valid in its own framework, can
be combined, for in the Hilbert space of a spin-half particle there is no subspace
which corresponds to $S_z = +1/2$ AND $S_x = +1/2$, see Sec. 4.6. Thus we have
two descriptions of the same quantum system which, because of the mathematical
structure of quantum theory, cannot be combined into a single description.
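
The obstacle to combining the two properties shows up in a two-line calculation. The Python sketch below uses the usual matrix representation of a spin-half particle in the $S_z$ basis; it assumes, as is standard and as Sec. 4.6 is cited for here, that a conjunction of two properties is defined only when their projectors commute. The variable names are illustrative.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Projectors in the S_z basis (standard representation for spin half).
P_Z_UP = [[1.0, 0.0], [0.0, 0.0]]  # S_z = +1/2
P_X_UP = [[0.5, 0.5], [0.5, 0.5]]  # S_x = +1/2, state (|z+> + |z->)/sqrt(2)

zx = matmul(P_Z_UP, P_X_UP)
xz = matmul(P_X_UP, P_Z_UP)
commutator = [[zx[i][j] - xz[i][j] for j in range(2)] for i in range(2)]

# A product of commuting projectors is again a projector; this one is not.
zx_squared = matmul(zx, zx)

print("commutator:", commutator)
print("product is idempotent:", zx_squared == zx)
```

Since the commutator is nonzero and the product of the two projectors is not itself a projector, there is no subspace of the two-dimensional Hilbert space that could represent the combined property.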

It is not the multiplicity of descriptions which distinguishes quantum from clas- 
sical mechanics, for multiple descriptions of the same object occur all the time in 
classical physics and in everyday life. A teacup has a different appearance when 
viewed from the top or from the side, and the side view depends on where the handle
is located, but there is never any problem in supposing that these different
descriptions refer to the same object. Or consider a macroscopic body which is spinning.
One description might specify the z-component $L_z$ of its angular momentum,
and another the x-component $L_x$. In classical physics, two correct descriptions of
a single object can always be combined to produce a single, more precise descrip- 
tion, and if this process is continued using all possible descriptions, the result will 
be a unique exhaustive description which contains each and every detail of every 
true description. In the case of a mechanical system at a single time, the unique 
exhaustive description corresponds to a single point in the classical phase space. 
Any true description can be obtained from the unique exhaustive description by 
coarsening it, that is, by omitting some of the details. Thus specifying a region 
in the phase space rather than a single point produces a coarser description of a 
mechanical system. 

For the purposes of the following discussion it is convenient to refer to the idea 
that there exists a unique exhaustive description as the principle of unicity, or sim- 
ply unicity. This principle implies that every conceivable property of a particular 
physical system will be either true or false, since it either is or is not contained in, 
or implied by, the unique exhaustive description. Thus unicity implies the existence
of a universal truth functional as defined in Sec. 22.4. But as was pointed out
in that section, there cannot be a universal truth functional for a quantum Hilbert 
space of dimension greater than 2. This is one of several ways of seeing that quan- 
tum theory is inconsistent with the principle of unicity, so that unicity is not part 
of quantum reality. It is the incompatibility of quantum descriptions which pre- 
vents them from being combined into a more precise description, and thus makes 
it impossible to create a unique exhaustive description. 

The difference between classical and quantum mechanics in this respect can be 
seen by considering a nondestructive measurement of $L_z$ for a macroscopic spinning
body, followed by a later measurement of $L_x$. Combining a description based
upon the first measurement with one based on the second takes one two thirds of
the way towards a unique exhaustive description of the angular momentum vector.
But trying to combine $S_z$ and $S_x$ values for a spin-half particle is, as already
noted, an impossibility, and this means that these two descriptions cannot be ob- 
tained by coarsening a unique exhaustive quantum description, and therefore no 
such description exists. 

In order to describe a quantum system, a physicist must, of necessity, adopt 
some framework and this means choosing among many incompatible frameworks, 
no one of which is, from a fundamental point of view, more appropriate or more 
“real” than any other. This freedom of choice on the part of the physicist has 
occasionally been misunderstood, so it is worth pointing out some things which it 
does not mean. 

First, the freedom to use different incompatible frameworks in order to construct 
different incompatible descriptions does not make quantum mechanics a subjective 
science. Two physicists who employ the same framework will reach identical con- 
clusions when starting from the same initial data. More generally, they will reach 
the same answers to the same physical questions, even when some question can 
be addressed using more than one framework; see the consistency argument in 
Sec. 16.3. To use an analogy, if one physicist discusses $L_z$ for a macroscopic spinning
object and another physicist $L_x$, their descriptions cannot be compared with
each other, but if both of them describe the same component of angular momentum
and infer its value from the same initial data, they will agree. The same is true of
$S_z$ and $S_x$ for a spin-half particle.

Second, what a physicist happens to be thinking about when choosing a frame- 
work in order to construct a quantum description does not somehow influence 
reality in a manner akin to psychokinesis. No one would suppose that a
physicist’s choosing to describe L_z rather than L_x for a macroscopic spinning body was
somehow influencing the body, and the same holds for quantum descriptions of
microscopic objects. Choosing an S_z, rather than, say, an S_x framework makes it
possible to discuss S_z, but does not determine its value. Once the framework has
been adopted it may be possible by logical reasoning, given suitable data, to infer
that S_z = +1/2 rather than -1/2, but this is no more a case of mind influencing
matter than would be a similar inference of a value of L_z for a macroscopic body.

Third, choosing a framework T for constructing a description does not mean that 
some other description constructed using an incompatible framework Q is false. 
Quantum incompatibility is very different from the notion of mutually exclusive 
descriptions, where the truth of one implies the falsity of the other. Once again the 
analogy of classical angular momentum is helpful: a description which assigns a
value to L_z does not in any way render false a description which assigns a value to
L_x, even though it does exclude a description that assigns a different value to L_z.
The same comments apply to S_z and S_x in the quantum case.

In order to avoid the mistake of supposing that incompatible descriptions are 
mutually exclusive, it is helpful to think of them as referring to different aspects of 
a quantum system. Thus using the S_z framework allows the physicist to describe
the “S_z aspect” of a spin-half particle, which is quite distinct from the “S_x aspect”.
To be sure, one still has to remember that, unlike the situation in classical physics, 
two incompatible aspects cannot both enter a single description of a quantum sys- 
tem. While using an appropriate terminology and employing classical analogies 
are helpful for understanding the concept of quantum incompatibility, it remains 
true that this is one feature of quantum reality which is far easier to represent in 
mathematical terms than by means of a physical picture. 
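
As a small illustration of the mathematical side of this remark, the following sketch (an addition for illustration, not part of the formal development) writes the projectors for S_z = +1/2, S_z = -1/2 and S_x = +1/2 as 2x2 matrices in the S_z basis and checks which pairs commute. The helper names (matmul, commutator, is_zero) are hypothetical. Projectors belonging to the same framework commute, whereas those for S_z and S_x do not, which is the algebraic signature of incompatibility.

```python
# Projectors for a spin-half particle, written in the S_z basis as 2x2 nested lists.
# All function and variable names here are illustrative choices.

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def is_zero(m):
    """True if every entry of the 2x2 matrix is numerically zero."""
    return all(abs(m[i][j]) < 1e-12 for i in range(2) for j in range(2))

P_Z_PLUS = [[1.0, 0.0], [0.0, 0.0]]   # projector onto S_z = +1/2
P_Z_MINUS = [[0.0, 0.0], [0.0, 1.0]]  # projector onto S_z = -1/2
P_X_PLUS = [[0.5, 0.5], [0.5, 0.5]]   # projector onto S_x = +1/2

# Two properties from the same (S_z) framework commute.
same_framework = is_zero(commutator(P_Z_PLUS, P_Z_MINUS))
# Properties from the S_z and S_x frameworks do not commute.
different_frameworks = is_zero(commutator(P_Z_PLUS, P_X_PLUS))

print(same_framework, different_frameworks)  # prints: True False
```

The nonzero commutator in the second case is what prevents the “S_z aspect” and the “S_x aspect” from entering a single description.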


27.4 The macroscopic world 

Our most immediate contact with physical reality comes from our sensory expe- 
rience of the macroscopic world: what we see, hear, touch, etc. A fundamental 
physical theory should, at least in principle, be able to explain the macroscopic 
phenomena we encounter in everyday life. But there is no reason why it must be 
built up entirely out of concepts from everyday experience, or restricted to everyday 
language. Modern physical theories posit all sorts of strange things, from quarks
to black holes, that are totally alien to everyday experience, and whose description 
often requires some rather abstract mathematics. There is no reason to deny that 
such objects are part of physical reality, as long as they form part of a coherent the- 
oretical structure which can relate them, even somewhat indirectly, to things which 
are accessible to our senses. 

Two considerations suggest that quantum mechanics can (in principle) explain 
the world of our everyday experience in a satisfactory way. First, the macroscopic 
world can be described very well using classical physics. Second, as discussed in 
Sec. 26.6, classical mechanics is a good approximation to a fully quantum mechanical
description of the world in precisely those circumstances in which classical
physics is known to work very well. This quantum description employs a quasi- 
classical framework in which appropriate macro projectors represent properties of 
macroscopic objects, and the relevant histories, which are well-approximated by 
solutions of classical equations of motion, are rendered consistent by a process of 
decoherence, that is, by interaction with the (internal or external) environment of 
the system whose motion is being discussed. 

It is important to note that all of the phenomena of macroscopic classical physics 
can be described using a single quasi-classical quantum framework. Within a single 
framework the usual rules of classical reasoning and probability theory apply, and 
quantum incompatibility, which has to do with the relationship between different 
frameworks, never arises. In this way one can understand why quantum incom- 
patibility is completely foreign to classical physics and invisible in the everyday 
world. (As pointed out in Sec. 26.6, there are actually many different quasi- 
classical frameworks, each of which gives approximately the same results for the 
macroscopic variables of classical physics. This multiplicity does not alter the 
validity of the preceding remarks, since a description can employ any one of these 
frameworks and still lead to the same classical physics.) 

Stochastic quantum dynamics can be reconciled with deterministic classical 
dynamics by noting that the latter is in many circumstances a rather good approx- 
imation to a quasi-classical history that the quantum system follows with high 
probability. Classical chaotic motion is an exception, but in this case classical 
dynamics, while in principle deterministic, is as a practical matter stochastic, since 
small errors in initial conditions are rapidly amplified into large and observable 
differences in the motion of the system. Thus even in this instance the situation is 
not much different from quantum dynamics, which is intrinsically stochastic. 
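
The amplification of small errors in initial conditions can be seen in a simple numerical sketch. The logistic map used here is a standard textbook example of deterministic chaos, chosen only for illustration; it is not a model of any system discussed in this book, and the function name and the particular initial values are assumptions of the example.

```python
# Deterministic but chaotic dynamics: the logistic map x -> r x (1 - x) with r = 4.
# Two orbits start a distance of about 1e-10 apart and are compared step by step.

def logistic_orbit(x0, steps, r=4.0):
    """Return the list [x_0, x_1, ..., x_steps] generated by the logistic map."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

orbit_a = logistic_orbit(0.2, 100)
orbit_b = logistic_orbit(0.2 + 1e-10, 100)

# Distance between the two orbits at each step.
separation = [abs(a - b) for a, b in zip(orbit_a, orbit_b)]

print("separation after 5 steps:", separation[5])
print("largest separation within 100 steps:", max(separation))
```

After a few steps the two orbits are still practically indistinguishable, but within a hundred steps their separation becomes comparable to the size of the interval itself, so that for practical purposes the later motion cannot be predicted from imprecisely known initial data.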

The relationship of quantum theory to pre-quantum physics is in some ways 
analogous to the relationship between special relativity and Newtonian mechanics. 
Space and time in relativity theory are related to each other in a very different 
way than in nonrelativistic mechanics, in which time is absolute. Nonetheless, as 
long as velocities are much less than the speed of light, nonrelativistic mechanics 
is an excellent approximation to a fully relativistic mechanics. One never even 
bothers to think about relativistic corrections when designing the moving parts 
of an automobile engine. The same theory of relativity that shows that the older 
ideas of physical reality are very wrong when applied to bodies moving at close 
to the speed of light also shows that they work extremely well when applied to 
objects which move slowly. In the same way, quantum theory shows us that our 
notions of pre-quantum reality are entirely inappropriate when applied to electrons 
moving inside atoms, but work extremely well when applied to pistons moving 
inside cylinders. 





However, quantum mechanics also allows the use of non-quasi-classical frame- 
works for describing macroscopic systems. For example, the macroscopic detec- 
tors which determine the channel in which a spin-half particle emerges from a 
Stern-Gerlach magnet, as discussed in Secs. 17.3 and 17.4, can be described by a
quasi-classical framework T, such as (17.25), in which one or the other detector
detects the particle, or by a non-quasi-classical framework Q in which the initial 
state develops unitarily into a macroscopic quantum superposition (MQS) state of 
the detector system. Is it a defect of quantum mechanics as a fundamental theory 
that it allows the physicist to use either of the incompatible frameworks T and Q 
to construct a description of this situation, given that MQS states of this sort are 
never observed in the laboratory? 

One must keep in mind the fact mentioned in the previous section that two 
incompatible quantum frameworks T and Q do not represent mutually exclusive
possibilities in the sense that if the world is correctly described by T it cannot be 
correctly described by Q, and vice versa. Instead it is best to think of T and Q as 
means by which one can describe different aspects of the quantum system, as sug- 
gested at the end of Sec. 27.3. To discuss which detector has detected the particle 
one must employ T, since the concept makes no sense in Q, whereas the “MQS 
aspect” or “unitary time development aspect” for which Q is appropriate makes 
no sense in T. Either framework can be employed to answer those questions for
which it is appropriate, but the answers given by the two frameworks cannot be
combined or compared. (Also see the discussion of Schrödinger’s cat in Sec. 9.6.)

If one were trying to set up an experiment to detect the MQS state, then one 
would want to employ the framework Q, or, rather, its extension to a framework 
which included the additional measuring apparatus which would be needed to de- 
termine whether the detector system was in the MQS state or in some state orthog- 
onal to it. In fact, by using the principles of quantum theory one can argue that 
actual observations of MQS states are extremely difficult, even if “macroscopic” is 
employed somewhat loosely to include even an invisible grain of material contain- 
ing a few million atoms. The process of decoherence in such situations is extremely 
fast, and in any case constructing some apparatus sensitive to the relative phases in 
a macroscopic superposition is a practical impossibility. It may be helpful to draw 
an analogy with the second law of thermodynamics. Whereas there is nothing in 
the laws of classical (or quantum) mechanics which prevents the entropy of a sys- 
tem from decreasing as a function of time, in practice this is never observed, and the 
principles of statistical mechanics provide a plausible explanation through assign- 
ing an extremely small probability to violations of the second law. In a similar way, 
quantum mechanics can explain why MQS states are never observed in the labora- 
tory, even though they are very much a part of the fundamental theory, and hence 
also part of physical reality to the extent that quantum theory reflects that reality. 
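
A rough feeling for why the relative phases in a macroscopic superposition are so hard to detect can be obtained from a toy model, offered here only as a sketch under stated assumptions. Suppose the two branches of an equal superposition become correlated with environment states that are products over n particles, and that each particle contributes the same overlap c, with |c| slightly less than 1, between the two branches. Then the off-diagonal element of the reduced density matrix of the system has magnitude |c|^n / 2. The function name and the numerical values below are hypothetical.

```python
# Toy decoherence estimate (assumed model: product environment states,
# identical single-particle overlap for every environment particle).

def coherence(overlap_per_particle, n_particles):
    """Magnitude of the off-diagonal reduced density matrix element
    for an equal superposition of two branches."""
    return 0.5 * abs(overlap_per_particle) ** n_particles

for n in (0, 100, 10_000):
    print(n, coherence(0.99, n))
```

Even when each environment particle distinguishes the two branches only slightly, the coherence falls off exponentially with the number of particles involved, which is consistent with the remark above that observing MQS states is a practical impossibility.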

The difficulty of observing MQS states also explains why violations of the
principle of unicity (see the previous section) are not seen in macroscopic systems,
even though readily apparent in atoms. The breakdown of unicity is only apparent 
when one constructs descriptions using different incompatible frameworks, so it is 
never apparent if one restricts attention to a single framework. As noted earlier, 
classical physics works very well for a macroscopic system precisely because it is 
a good approximation to a quantum description based on a single quasi-classical 
framework. Hence even though quantum mechanics violates the principle of unic- 
ity, quantum mechanics itself provides a good explanation as to why that principle 
is always obeyed in classical physics, and its violation was neither observed nor 
even suspected before the advent of the scientific developments which led to quan- 
tum theory. 


27.5 Conclusion 

Quantum mechanics is clearly superior to classical mechanics for the description of 
microscopic phenomena, and in principle works equally well for macroscopic phe- 
nomena. Hence it is at least plausible that the mathematical and logical structure 
of quantum mechanics better reflect physical reality than do their classical counter- 
parts. If this reasoning is accepted, quantum theory requires various changes in our 
view of physical reality relative to what was widely accepted before the quantum 
era, among them the following: 

1. Physical objects never possess a completely precise position or momentum.

2. The fundamental dynamical laws of physics are stochastic and not deter- 
ministic, so from the present state of the world one cannot infer a unique 
future (or past) course of events. 

3. The principle of unicity does not hold: there is not a unique exhaustive de- 
scription of a physical system or a physical process. Instead, reality is such 
that it can be described in various alternative, incompatible ways, using 
descriptions which cannot be combined or compared. 

All of these, and especially the third, represent radical revisions of the
pre-quantum view of physical reality based upon, or at least closely allied to, classical
mechanics. At the same time it is worth emphasizing that there are other respects 
in which the development of quantum theory leaves previous ideas about physical 
reality unchanged, or at least very little altered. The following is not an exhaustive 
list, but indicates a few of the ways in which the classical and quantum viewpoints 
are quite similar: 

1. Measurements play no fundamental role in quantum mechanics, just as they
play no fundamental role in classical mechanics. In both cases, measure- 
ment apparatus and the process of measurement are described using the 
same basic mechanical principles which apply to all other physical objects 
and physical processes. Quantum measurements, when interpreted using 
a suitable framework, can be understood as revealing properties of a mea- 
sured system before the measurement took place, in a manner which was 
taken for granted in classical physics. See the discussion in Chs. 17 and 
18. (It may be worth adding that there is no special role for human con- 
sciousness in the quantum measurement process, again in agreement with 
classical physics.) 

2. Quantum mechanics, like classical mechanics, is a local theory in the sense 
that the world can be understood without supposing that there are mys- 
terious influences which propagate over long distances more rapidly than 
the speed of light. See the discussion in Chs. 23-25 of the EPR paradox, 
Bell’s inequalities, and Hardy’s paradox. The idea that the quantum world 
is permeated by superluminal influences has come about because of an in- 
adequate understanding of quantum measurements — in particular, the as- 
sumption that wave function collapse is a physical process — or through 
assuming the existence of hidden variables instead of (or in addition to) the 
quantum Hilbert space, or by employing counterfactual arguments which 
do not satisfy the single-framework rule. By contrast, a consistent applica- 
tion of quantum principles provides a positive demonstration of the absence 
of nonlocal influences, as in the example discussed in Sec. 23.4. 

3. Both quantum mechanics and classical mechanics are consistent with the 
notion of an independent reality, a real world whose properties and fun- 
damental laws do not depend upon what human beings happen to believe, 
desire, or think. While this real world contains human beings, among other 
things, it existed long before the human race appeared on the surface of the 
earth, and our presence is not essential for it to continue. 

The idea of an independent reality had been challenged by philosophers long 
before the advent of quantum mechanics. However, the difficulty of interpreting 
quantum theory has sometimes been interpreted as providing additional reasons for 
doubting that such a reality exists. In particular, the idea that measurements col- 
lapse wave functions can suggest the notion that they thereby bring reality into ex- 
istence, and if a conscious observer is needed to collapse the wave function (MQS 
state) of a measuring apparatus, this could mean that consciousness somehow plays 
a fundamental role in reality. However, once measurements are understood as no 
more than particular examples of physical processes, and wave function collapse 
as nothing more than a computational tool, there is no reason to suppose that
quantum theory is incompatible with an independent reality, and one is back to the
situation which preceded the quantum era. To be sure, neither quantum nor clas- 
sical mechanics provides watertight arguments in favor of an independent reality. 
In the final analysis, believing that there is a real world “out there”, independent 
of ourselves, is a matter of faith. The point is that quantum mechanics is just as 
consistent with this faith as was classical mechanics. On the other hand, quantum 
theory indicates that the nature of this independent reality is in some respects quite 
different from what was earlier thought to be the case. 

Hardy (1992a) is the original publication of his paradox. Mermin (1994) gives a
very clear exposition of the basic idea. The GHZ paradox is explained in Green- 
berger et al. (1990). The counterfactual analysis in Sec. 25.5 extends Griffiths 

Ch. 26. Decoherence and the classical limit

A great deal has been written on the topic of decoherence and its relationship to 
the emergence of classical physics from quantum theory. For an introduction to 
the subject, see Zurek (1991). The book by Giulini et al. (1996) has contributions 
from diverse points of view and extensive references to earlier work. For work 
from a perspective close to the point of view found here, see Gell-Mann and Hartle 
(1993), Omnès (1999) (which gives references to earlier work), and Brun (1993,
1994).

For the argument of Omnès on the relationship of classical and quantum mechanics